File: tl.py

#!/usr/bin/python3

# The MIT License (MIT)
#
# Copyright © 2024 pacman64
#
# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the “Software”), to deal
# in the Software without restriction, including without limitation the rights
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
# copies of the Software, and to permit persons to whom the Software is
# furnished to do so, subject to the following conditions:
#
# The above copyright notice and this permission notice shall be included in
# all copies or substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
# SOFTWARE.


info = '''
tl [options...] [python expression] [filepaths/URIs...]


Transform Lines runs a Python expression on each line of plain-text data:
each expression given emits its result as its own line. Each input line is
available to the expression as either `line`, or `l`. Lines are always
stripped of any trailing end-of-line bytes/sequences.

When the expression results in non-string iterable values, a sort of input
`amplification` happens for the current input-line, where each item from
the result is emitted on its own output line. Dictionaries emit their data
as a single JSON line.

When a formula's result is the None value, it emits no output line, which
filters-out the current line, the same way empty-iterable results do.

When in `all` mode, all input lines are read first into a list of strings,
whose items are all stripped of any end-of-line sequences, and kept in the
`lines` global variable: the expression given is then run only once.

Similarly, if the argument before the expression is a single equals sign
(a `=`, but without the quotes), no data are read/loaded: the expression is
then run only once, effectively acting as a `pure` plain-text generator.

Current-input names, depending on mode:

    names               mode                   evaluation

    l, line             each-line (default)    for each input line
    lines               all-lines              once, after last input line
    b, block, p, par    block/paragraph        for each paragraph of lines
    v, value            jsonl                  for each input line
    (no name)           no-input               once, without an input value

Modes `each-line` (the default) and `block/paragraph` also define `i` as an
integer which starts as 0, and which is incremented after each evaluation.

Options, where leading double-dashes are also allowed, except for alias `=`:

    -a            read all lines at once into a string-list called `lines`
    -all          same as -a
    -lines        same as -a

    -b            read uninterrupted blocks/groups of lines as paragraphs
    -blocks       same as -b
    -g            same as -b
    -groups       same as -b
    -p            same as -b
    -par          same as -b
    -para         same as -b
    -paragraphs   same as -b

    -h            show this help message
    -help         same as -h

    -jsonl        transform JSON Lines into proper JSON

    -nil          don't read any input, and run the expression only once
    -no-input     same as -nil
    -noinput      same as -nil
    -none         same as -nil
    -null         same as -nil
    -null-input   same as -nil
    -nullinput    same as -nil
    =             same as -nil

    -p            show a performance/time-profile of the full `task` run
    -prof         same as -p
    -profile      same as -p

    -s            read each input as a whole string
    -str          same as -s
    -string       same as -s
    -w            same as -s
    -whole        same as -s

    -t            show a full traceback of this script for exceptions
    -trace        same as -t
    -traceback    same as -t


Extra Functions

blue(s)            color strings blue, using surrounding ANSI-style sequences
gray(s)            color strings gray, using surrounding ANSI-style sequences
green(s)           color strings green, using surrounding ANSI-style sequences
highlight(s)       highlight strings, using surrounding ANSI-style sequences
hilite(s)          same as func highlight
orange(s)          color strings orange, using surrounding ANSI-style sequences
purple(s)          color strings purple, using surrounding ANSI-style sequences
red(s)             color strings red, using surrounding ANSI-style sequences

realign(x, gap=2)  pad items across lines, so that all "columns" align

after(x, y)        ignore items until the one given; for strings and sequences
afterfinal(x, y)   backward counterpart of func after
afterlast(x, y)    same as func afterfinal
arrayish(x)        check if value is a list, a tuple, or a generator
basename(s)        get the final/file part of a pathname
before(x, y)       ignore items since the one given; for strings and sequences
beforefinal(x, y)  backward counterpart of func before
beforelast(x, y)   same as func beforefinal
chunk(x, size)     split/resequence items into chunks of the length given
chunked(x, size)   same as func chunk
compose(*args)     make a func which chain-calls all funcs given
composed(*args)    same as func compose
cond(*args)        expression-friendly fully-evaluated if-else chain
debase64(s)        decode base64 strings, including data-URIs
dedup(x)           ignore later (re)occurrences of values in a sequence
dejson(x, f=None)  safe parse JSON from strings
denan(x, y)        turn floating-point NaN values into the fallback given
denil(*args)       return the first non-null/none value among those given
denone(*args)      same as func denil
denull(*args)      same as func denil
dirname(s)         get the folder/directory/parent part of a pathname
dive(x, f)         transform value in depth-first-recursive fashion
divebin(x, y, f)   binary (2-input) version of recursive-transform func dive
drop(x, *what)     ignore keys or substrings; for strings, dicts, dict-lists
dropped(x, *v)     same as func drop
each(x, f)         generalization of built-in func map
endict(x)          turn non-dictionary values into dicts with string keys
enfloat(x, f=nan)  turn values into floats, offering a fallback on failure
enint(x, f=None)   turn values into ints, offering a fallback on failure
enlist(x)          turn non-list values into lists
entuple(x)         turn non-tuple values into tuples
ext(s)             return the file-extension part of a pathname, if available
fields(s)          split fields AWK-style from the string given
filtered(x, f)     same as func keep
flat(*args)        flatten everything into an unnested sequence
fromto(x, y, ?f)   sequence integers, end-value included
group(x, ?by)      group values into dicts of lists; optional transform func
grouped(x, ?by)    same as func group
harden(f, v)       make funcs which return values instead of exceptions
hardened(f, v)     same as func harden
countif(x, f)      count how many values make the func given true-like
idiota(x, ?f)      dict-counterpart of func iota
ints(x, y, ?f)     make sequences of increasing integers, which include the end
iota(x, ?f)        make an integer sequence from 1 up to the number given
join(x, y)         join values into a string; make a dict from keys and values
json0(x)           turn a value into its smallest JSON-string representation
json2(x)           turn a value into a 2-space-indented multi-line JSON string
jsonl(x)           turn a value into a sequence of single-line (JSONL) strings
keep(x, pred)      generalization of built-in func filter
kept(x, pred)      same as func keep
links(x)           auto-detect all hyperlink-like (HTTP/HTTPS) substrings
mapped(x, f)       same as func each
number(x)          try to parse as an int, on failure try to parse as a float
numbers(x)         auto-detect all numbers in the value given
numstats(x)        calculate various `single-pass` numeric stats
once(x, y=None)    avoid returning the same value more than once; stateful func
pick(x, *what)     keep only the keys given; works on dicts, or dict-sequences
picked(x, *what)   same as func pick
plain(s)           ignore ANSI-style sequences in strings
quoted(s, q='"')   surround a string with the (optional) quoting-symbol given
recover(*args)     recover from exceptions with a fallback value
reject(x, pred)    generalization of built-in func filter, with opposite logic
since(x, y)        ignore items before the one given; for strings and sequences
sincefinal(x, y)   backward counterpart of func since
sincelast(x, y)    same as func sincefinal
split(x, y)        split string by separator; split sequence into several ones
squeeze(s)         strip/trim a string, squishing inner runs of spaces
stround(x, d=6)    format numbers into decimal-number strings
tally(x, ?by)      count/tally values, using an optional transformation func
tallied(x, ?by)    same as func tally
trap(x, f=None)    try running a func, handing exceptions to a fallback func
trycall(*args)     same as func recover
unique(x)          same as func dedup
uniqued(x)         same as func dedup
unjson(x, f=None)  same as func dejson
unquoted(s)        ignore surrounding quotes, if present
until(x, y)        ignore items after the one given; for strings and sequences
untilfinal(x, y)   backward counterpart of func until
untillast(x, y)    same as func untilfinal
wait(seconds, x)   wait the given number of seconds, before returning a value
wat(*args)         What Are These (wat) shows help/doc messages for funcs


Examples

# numbers from 0 to 5, each on its own output line; no input is read/used
tl = 'range(6)'

# all powers up to the 4th, using each input line auto-parsed into a `float`
tl = 'range(1, 6)' | tl '(float(l)**p for p in range(1, 4+1))'

# separate input lines with an empty line between each; global var `empty`
# can be used to avoid bothering with nested shell-quoting
tl = 'range(6)' | tl '["", l] if i > 0 else l'

# keep only the last 2 lines from the input
tl = 'range(1, 6)' | tl -all 'lines[-2:]'

# join input lines into tab-separated lines of up to 3 items each; global
# var named `tab` can be used to avoid bothering with nested shell-quoting
tl = 'range(1, 8)' | tl -all '("\\t".join(c) for c in chunk(lines, 3))'

# ignore all lines before the first one with just a '5' in it
tl = 'range(8)' | tl -all 'since(lines, "5")'

# ignore errors/exceptions, in favor of the original lines/values
tl = '("abc", "123")' | tl 'safe(lambda: 2 * float(line), line)'

# ignore errors/exceptions, calling a fallback func with the exception
tl = '("abc", "123")' | tl 'safe(lambda: 2 * float(line), lambda e: str(e))'

# filtering lines out via None values
head -c 1024 /dev/urandom | strings | tl 'l if len(l) < 20 else None'

# boolean-valued results are concise ways to filter lines out
head -c 1024 /dev/urandom | strings | tl 'len(l) < 20'

# function/callable results are automatically called on the current line
head -c 1024 /dev/urandom | strings | tl len
'''


from sys import argv, exit, stderr, stdin, stdout


if __name__ != '__main__':
    print('don\'t import this script, run it directly instead', file=stderr)
    exit(1)

# no args or a leading help-option arg means show the help message and quit
if len(argv) < 2 or argv[1] in ('-h', '--h', '-help', '--help'):
    from sys import exit, stderr
    print(info.strip(), file=stderr)
    exit(0)


from io import StringIO, TextIOWrapper

from typing import \
    AbstractSet, Annotated, Any, AnyStr, \
    AsyncContextManager, AsyncGenerator, AsyncIterable, AsyncIterator, \
    Awaitable, BinaryIO, ByteString, Callable, cast, \
    ClassVar, Collection, Container, \
    ContextManager, Coroutine, Deque, Dict, Final, \
    final, ForwardRef, FrozenSet, Generator, Generic, get_args, get_origin, \
    get_type_hints, Hashable, IO, ItemsView, \
    Iterable, Iterator, KeysView, List, Literal, Mapping, \
    MappingView, Match, MutableMapping, MutableSequence, MutableSet, \
    NamedTuple, NewType, no_type_check, no_type_check_decorator, \
    NoReturn, Optional, overload, \
    Protocol, Reversible, \
    runtime_checkable, Sequence, Set, Sized, SupportsAbs, \
    SupportsBytes, SupportsComplex, SupportsFloat, SupportsIndex, \
    SupportsInt, SupportsRound, Text, TextIO, Tuple, Type, \
    TypedDict, TypeVar, \
    TYPE_CHECKING, Union, ValuesView
try:
    from typing import \
        assert_never, assert_type, clear_overloads, Concatenate, \
        dataclass_transform, get_overloads, is_typeddict, LiteralString, \
        Never, NotRequired, ParamSpec, ParamSpecArgs, ParamSpecKwargs, \
        Required, reveal_type, Self, TypeAlias, TypeGuard, TypeVarTuple, \
        Unpack
    from typing import \
        AwaitableGenerator, override, TypeAliasType, type_check_only
except Exception:
    pass


def conforms(x: Any) -> bool:
    '''
    Check if a value is JSON-compatible, which includes checking values
    recursively, in case of composite/nestable values.
    '''

    if x is None or isinstance(x, (bool, int, str)):
        return True
    if isinstance(x, float):
        return not (isnan(x) or isinf(x))
    if isinstance(x, (list, tuple)):
        return all(conforms(e) for e in x)
    if isinstance(x, dict):
        return all(conforms(k) and conforms(v) for k, v in x.items())
    return False


def seems_url(s: str) -> bool:
    protocols = ('https://', 'http://', 'file://', 'ftp://', 'data:')
    return any(s.startswith(p) for p in protocols)


def disabled_exec(*args, **kwargs) -> None:
    _ = args
    _ = kwargs
    raise Exception('built-in func `exec` is disabled')


def fix_value(x: Any, default: Any) -> Any:
    'Adapt a value so it can be output.'

    # true shows the current line as the current output; presumably
    # this is the result of calling a `condition-like` expression
    if x is True:
        return default

    # null and false show no output for the current input line
    if x is False:
        return None

    if x is type:
        return type(default).__name__

    # if expression results in a func, auto-call it with the original data
    if callable(x) and not isinstance(x, Iterable):
        c = required_arg_count(x)
        if c == 1:
            x = x(default)
        else:
            m = f'func auto-call only works with 1-arg funcs (func wanted {c})'
            raise Exception(m)

    if x is None or isinstance(x, (bool, int, float, str)):
        return x

    rec = fix_value

    if isinstance(x, dict):
        return {
            rec(k, default): rec(v, default) for k, v in x.items() if not
                (isinstance(k, Skip) or isinstance(v, Skip))
        }
    if isinstance(x, Iterable):
        return tuple(rec(e, default) for e in x if not isinstance(e, Skip))

    if isinstance(x, Dottable):
        return rec(x.__dict__, default)
    if isinstance(x, DotCallable):
        return rec(x.value, default)

    if isinstance(x, Exception):
        raise x

    return None if isinstance(x, Skip) else str(x)


def show_value(w, x: Any) -> None:
    'Helper func used by func show_result.'

    # null shows no output for the current input line
    if x is None or isinstance(x, Skip):
        return

    if isinstance(x, dict):
        dump(x, w, separators=(', ', ': '), allow_nan=False, indent=None)
        w.write('\n')
        w.flush()
    elif isinstance(x, (bytes, bytearray)):
        w.write(x)
        w.flush()
    elif isinstance(x, Iterable) and not isinstance(x, str):
        dump(x, w, separators=(', ', ': '), allow_nan=False, indent=None)
        w.write('\n')
        w.flush()
    elif isinstance(x, DotCallable):
        print(x.value, file=w, flush=True)
    else:
        print(x, file=w, flush=True)


def show_result(w, x: Any) -> None:
    if isinstance(x, (dict, str)):
        show_value(w, x)
    elif isinstance(x, Iterable):
        for e in x:
            if isinstance(e, Exception):
                raise e
            show_value(w, e)
    else:
        show_value(w, x)


def make_open_utf8(open: Callable) -> Callable:
    'Restrict the file-open func to a read-only utf-8 file-open func.'
    def open_utf8_readonly(name: str):
        'A UTF-8 read-only file-open func overriding the built-in open func.'
        return open(name, encoding='utf-8')
    return open_utf8_readonly

open_utf8 = make_open_utf8(open)
open = open_utf8


def loop_lines_inputs(r, inputs: List[str], doing: Callable) -> None:
    '''
    Act on multiple named inputs line-by-line, via the func given; when
    not given any named inputs, the default reader given is used instead.
    '''

    main_input: List[str] = []
    got_main_input = False
    dashes = inputs.count('-')

    if any(seems_url(e) for e in inputs):
        from urllib.request import urlopen

    def _adapt_lines(src) -> Iterable[str]:
        for j, line in enumerate(src):
            if j == 0:
                # lines are already-decoded strings, where a leading UTF-8
                # BOM shows up as the single code point U+FEFF
                line = line.lstrip('\ufeff')
            # strip any trailing carriage-returns and line-feeds
            yield line.rstrip('\r\n')

    for path in inputs:
        if path == '-':
            if dashes == 1:
                doing(_adapt_lines(r))
                continue

            if not got_main_input:
                main_input = r.read().splitlines()
                got_main_input = True
            doing(_adapt_lines(main_input))
            continue

        if seems_url(path):
            with urlopen(path) as inp:
                with TextIOWrapper(inp, encoding='utf-8') as txt:
                    doing(_adapt_lines(txt))
            continue

        with open_utf8(path) as inp:
            doing(_adapt_lines(inp))

    if len(inputs) == 0:
        doing(_adapt_lines(r))


def loop_whole_inputs(r, inputs: List[str], doing: Callable) -> None:
    '''
    Act on multiple named inputs, read as whole strings, via the func given;
    when not given any named inputs, the default reader given is used instead.
    '''

    main_input: List[str] = []
    got_main_input = False
    dashes = inputs.count('-')

    if any(seems_url(e) for e in inputs):
        from urllib.request import urlopen

    for path in inputs:
        if path == '-':
            if dashes == 1:
                doing(r.read())
                continue

            if not got_main_input:
                main_input = r.read()
                got_main_input = True
            doing(main_input)
            continue

        if seems_url(path):
            with urlopen(path) as inp:
                with TextIOWrapper(inp, encoding='utf-8') as txt:
                    doing(txt.read())
            continue

        with open_utf8(path) as inp:
            doing(inp.read())

    if len(inputs) == 0:
        doing(r.read())


def main_whole_strings(out, r, expression, inputs) -> None:
    def _each_string(out, src, expression: Any) -> None:
        # `comprehension` expressions seem to ignore local variables: even
        # lambda-based workarounds fail
        global s, t, text, v, value, w, whole, _

        s = t = text = v = value = w = whole = src
        res = eval(expression)
        res = fix_value(res, src)
        show_result(out, res)
        _ = res

    loop_whole_inputs(r, inputs, lambda s: _each_string(out, s, expression))


def main_each_line(w, r, expression, inputs) -> None:
    def _each_line(w, src, expression: Any) -> None:
        # `comprehension` expressions seem to ignore local variables: even
        # lambda-based workarounds fail
        global i, nr, previous, prev, line, l, _

        previous = ''
        prev = previous

        for line in src:
            l = line
            res = eval(expression)
            res = fix_value(res, line)
            show_result(w, res)
            i += 1
            nr += 1
            previous = line
            prev = previous
            _ = res

    loop_lines_inputs(r, inputs, lambda r: _each_line(w, r, expression))


def main_each_block(w, r, expression, inputs) -> None:
    def _each_block(w, r, expression) -> None:
        # `comprehension` expressions seem to ignore local variables: even
        # lambda-based workarounds fail
        global i, nr
        global previous, prev, lines, block, par, para, paragraph, _

        for item in paragraphize(r):
            lines = block = par = para = paragraph = item
            res = eval(expression)
            if isinstance(res, Skip):
                # remember the current block: name `data` isn't set in
                # this mode, so the block itself is used instead
                previous = lines
                prev = previous
                i += 1
                nr += 1
                continue

            res = fix_value(res, lines)
            show_result(w, res)
            i += 1
            nr += 1
            prev = previous = lines
            _ = res

    loop_lines_inputs(r, inputs, lambda r: _each_block(w, r, expression))


def main_all_lines(w, r, expression, inputs) -> None:
    # `comprehension` expressions seem to ignore local variables: even
    # lambda-based workarounds fail
    global line, lines, data, values, d, l, v, dat, val

    def _all_lines(w, r, expression) -> None:
        # `comprehension` expressions seem to ignore local variables: even
        # lambda-based workarounds fail
        global lines, line, l

        for e in r:
            line = l = e
            lines.append(line)

    lines = []
    line = l = ''
    loop_lines_inputs(r, inputs, lambda r: _all_lines(w, r, expression))
    data = values = d = v = dat = val = lines
    res = eval(expression)
    res = fix_value(res, lines)
    show_result(w, res)


def main_all_bytes(w, r, expression, inputs) -> None:
    # `comprehension` expressions seem to ignore local variables: even
    # lambda-based workarounds fail
    global data, values, d, v, dat, val
    data = values = d = v = dat = val = r.buffer.read()
    res = eval(expression)
    res = fix_value(res, data)
    show_result(w, res)


def main_no_input(w, r, expression, inputs) -> None:
    res = eval(expression)
    fix = lambda x: fix_value(x, None)
    f = str if res is None or isinstance(res, bool) else fix
    res = f(res)
    show_result(w, res)


def main_json_lines(w, r, expression, inputs) -> None:
    def _jsonl2json(w, src, expression: Any) -> None:
        # `comprehension` expressions seem to ignore local variables: even
        # lambda-based workarounds fail
        global i, nr
        global line, l, data, d, value, v, dat, val, prev, previous, _

        previous = None
        prev = previous

        for line in src:
            if emptyish_re.match(line) or commented_re.match(line):
                continue
            l = line

            data = value = d = v = dat = val = loads(line)
            res = eval(expression)

            if isinstance(res, Skip):
                previous = data
                prev = previous
                i += 1
                nr += 1
                continue
            res = fix_value(res, data)

            if callable(res):
                res = res(data)
            if not conforms(res):
                res = conform(res)
            dump(res, w)
            _ = res
            w.write('\n')

            previous = data
            prev = previous
            i += 1
            nr += 1

    loop_lines_inputs(r, inputs, lambda r: _jsonl2json(w, r, expression))


# opts2modes simplifies option-handling in func main
opts2modes = {
    '=': 'no-input',
    '-nil': 'no-input',
    '-no-input': 'no-input',
    '-noinput': 'no-input',
    '-none': 'no-input',
    '-None': 'no-input',
    '-null': 'no-input',
    '-null-input': 'no-input',
    '-nullinput': 'no-input',
    '--n': 'no-input',
    '--nil': 'no-input',
    '--no-input': 'no-input',
    '--noinput': 'no-input',
    '--none': 'no-input',
    '--None': 'no-input',
    '--null': 'no-input',
    '--null-input': 'no-input',
    '--nullinput': 'no-input',
    '-a': 'all-lines',
    '-all': 'all-lines',
    '-lines': 'all-lines',
    '--a': 'all-lines',
    '--all': 'all-lines',
    '--lines': 'all-lines',
    '-b': 'each-block',
    '-blocks': 'each-block',
    '-g': 'each-block',
    '-groups': 'each-block',
    '-p': 'each-block',
    '-par': 'each-block',
    '-para': 'each-block',
    '-paragraphs': 'each-block',
    '--b': 'each-block',
    '--blocks': 'each-block',
    '--g': 'each-block',
    '--groups': 'each-block',
    '--p': 'each-block',
    '--par': 'each-block',
    '--para': 'each-block',
    '--paragraphs': 'each-block',
    '-bytes': 'bytes',
    '--bytes': 'bytes',
    '-jl': 'json-lines',
    '-jsonl': 'json-lines',
    '-jsonlines': 'json-lines',
    '-json-lines': 'json-lines',
    '--jl': 'json-lines',
    '--jsonl': 'json-lines',
    '--jsonlines': 'json-lines',
    '--json-lines': 'json-lines',
    '-s': 'whole-strings',
    '-str': 'whole-strings',
    '-string': 'whole-strings',
    '--s': 'whole-strings',
    '--str': 'whole-strings',
    '--string': 'whole-strings',
    '-w': 'whole-strings',
    '-whole': 'whole-strings',
    '--w': 'whole-strings',
    '--whole': 'whole-strings',
}


def blue(s: Any) -> str:
    'Blue-style a plain string via ANSI-style sequences.'
    return f'\x1b[38;5;26m{s}\x1b[0m'

def blueback(s: Any) -> str:
    'Blue-background-style a plain string via ANSI-style sequences.'
    return f'\x1b[48;5;26m\x1b[38;5;255m{s}\x1b[0m'

bluebg = blueback

def bold(s: Any) -> str:
    'Bold-style a plain string via ANSI-style sequences.'
    return f'\x1b[1m{s}\x1b[0m'

def gbm(s: str, good: Any = False, bad: Any = False, meh: Any = False) -> str:
    '''
    Good, Bad, Meh ANSI-styles a plain string via ANSI-style sequences,
    according to 1..3 conditions given as boolean(ish) values: these are
    checked in order, so the first truish one wins.
    '''

    if good:
        return green(s)
    if bad:
        return red(s)
    if meh:
        return gray(s)
    return s

def gray(s: Any) -> str:
    'Gray-style a plain string via ANSI-style sequences.'
    return f'\x1b[38;5;248m{s}\x1b[0m'

def grayback(s: Any) -> str:
    'Gray-background-style a plain string via ANSI-style sequences.'
    return f'\x1b[48;5;253m{s}\x1b[0m'

graybg = grayback

def green(s: Any) -> str:
    'Green-style a plain string via ANSI-style sequences.'
    return f'\x1b[38;5;29m{s}\x1b[0m'

def greenback(s: Any) -> str:
    'Green-background-style a plain string via ANSI-style sequences.'
    return f'\x1b[48;5;29m\x1b[38;5;255m{s}\x1b[0m'

greenbg = greenback

def highlight(s: Any) -> str:
    'Highlight/reverse-style a plain string via ANSI-style sequences.'
    return f'\x1b[7m{s}\x1b[0m'

hilite = highlight

def magenta(s: Any) -> str:
    'Magenta-style a plain string via ANSI-style sequences.'
    return f'\x1b[38;5;165m{s}\x1b[0m'

def magentaback(s: Any) -> str:
    'Magenta-background-style a plain string via ANSI-style sequences.'
    return f'\x1b[48;5;165m\x1b[38;5;255m{s}\x1b[0m'

magback = magentaback
magbg = magentaback
magentabg = magentaback

def orange(s: Any) -> str:
    'Orange-style a plain string via ANSI-style sequences.'
    return f'\x1b[38;5;166m{s}\x1b[0m'

def orangeback(s: Any) -> str:
    'Orange-background-style a plain string via ANSI-style sequences.'
    return f'\x1b[48;5;166m\x1b[38;5;255m{s}\x1b[0m'

orangebg = orangeback
orback = orangeback
orbg = orangeback

def purple(s: Any) -> str:
    'Purple-style a plain string via ANSI-style sequences.'
    return f'\x1b[38;5;99m{s}\x1b[0m'

def purpleback(s: Any) -> str:
    'Purple-background-style a plain string via ANSI-style sequences.'
    return f'\x1b[48;5;99m\x1b[38;5;255m{s}\x1b[0m'

purback = purpleback
purbg = purpleback
purplebg = purpleback

def red(s: Any) -> str:
    'Red-style a plain string via ANSI-style sequences.'
    return f'\x1b[38;5;1m{s}\x1b[0m'

def redback(s: Any) -> str:
    'Red-background-style a plain string via ANSI-style sequences.'
    return f'\x1b[48;5;1m\x1b[38;5;255m{s}\x1b[0m'

redbg = redback

def underline(s: Any) -> str:
    'Underline-style a plain string via ANSI-style sequences.'
    return f'\x1b[4m{s}\x1b[0m'


def realign(lines: List[str], gap: int = 2) -> Iterable:
    '''
    Pad lines so that their items align across/vertically: extra padding
    is put between such `columns`, using 2 spaces by default.
    '''

    widths: List[int] = []
    for l in lines:
        items = awk_sep_re.split(l.strip())
        while len(widths) < len(items):
            widths.append(0)
        for i, s in enumerate(items):
            widths[i] = max(widths[i], len(s))

    sb = StringIO()
    gap = max(gap, 0)

    for l in lines:
        sb.truncate(0)
        sb.seek(0)

        padding = 0
        items = awk_sep_re.split(l.strip())
        for s, w in zip(items, widths):
            sb.write(padding * ' ')
            sb.write(s)
            padding = max(w - len(s), 0) + gap

        yield sb.getvalue()


def stop_normal(x: Any, exit_code: int = 0) -> NoReturn:
    show_result(stdout, fix_value(x, None))
    exit(exit_code)


def stop_json(x: Any, exit_code: int = 0) -> NoReturn:
    dump(x, stdout)
    stdout.write('\n')
    stdout.flush()
    exit(exit_code)


from base64 import \
    standard_b64encode, standard_b64decode, \
    standard_b64encode as base64bytes, standard_b64decode as debase64bytes

from collections import \
    ChainMap, Counter, defaultdict, deque, namedtuple, OrderedDict, \
    UserDict, UserList, UserString

from copy import copy, deepcopy

from datetime import \
    MAXYEAR, MINYEAR, date, datetime, time, timedelta, timezone, tzinfo
try:
    # constant UTC is only available on newer versions of python
    from datetime import UTC
except Exception:
    pass

# module datetime has no func named `now`, so importing it always failed:
# the classmethod from type datetime gives the same result as the fallback
now = datetime.now

from decimal import Decimal, getcontext

from difflib import \
    context_diff, diff_bytes, Differ, get_close_matches, HtmlDiff, \
    IS_CHARACTER_JUNK, IS_LINE_JUNK, ndiff, restore, SequenceMatcher, \
    unified_diff

from fractions import Fraction

import functools
from functools import \
    cache, cached_property, cmp_to_key, get_cache_token, lru_cache, \
    namedtuple, partial, partialmethod, recursive_repr, reduce, \
    singledispatch, singledispatchmethod, total_ordering, update_wrapper, \
    wraps

from glob import glob, iglob

try:
    from graphlib import CycleError, TopologicalSorter
except Exception:
    pass

from hashlib import \
    file_digest, md5, pbkdf2_hmac, scrypt, sha1, sha224, sha256, sha384, \
    sha512

from inspect import getfullargspec, getsource

import itertools
from itertools import \
    accumulate, chain, combinations, combinations_with_replacement, \
    compress, count, cycle, dropwhile, filterfalse, groupby, islice, \
    permutations, product, repeat, starmap, takewhile, tee, zip_longest
try:
    from itertools import pairwise
    from itertools import batched
except Exception:
    pass

from json import dump, dumps, loads

import math
Math = math
from math import \
    acos, acosh, asin, asinh, atan, atan2, atanh, ceil, comb, \
    copysign, cos, cosh, degrees, dist, e, erf, erfc, exp, expm1, \
    fabs, factorial, floor, fmod, frexp, fsum, gamma, gcd, hypot, inf, \
    isclose, isfinite, isinf, isnan, isqrt, lcm, ldexp, lgamma, log, \
    log10, log1p, log2, modf, nan, nextafter, perm, pi, pow, prod, \
    radians, remainder, sin, sinh, sqrt, tan, tanh, tau, trunc, ulp
try:
    from math import cbrt, exp2
except Exception:
    pass

power = pow

import operator

from pathlib import Path

from pprint import \
    isreadable, isrecursive, pformat, pp, pprint, PrettyPrinter, saferepr

from random import \
    betavariate, choice, choices, expovariate, gammavariate, gauss, \
    getrandbits, getstate, lognormvariate, normalvariate, paretovariate, \
    randbytes, randint, random, randrange, sample, seed, setstate, \
    shuffle, triangular, uniform, vonmisesvariate, weibullvariate

compile_py = compile # keep built-in func compile for later
from re import compile as compile_uncached, Pattern, IGNORECASE

import statistics
from statistics import \
    bisect_left, bisect_right, fmean, \
    geometric_mean, harmonic_mean, mean, median, \
    median_grouped, median_high, median_low, mode, multimode, pstdev, \
    pvariance, quantiles, stdev, variance
try:
    from statistics import \
        correlation, covariance, linear_regression, mul
except Exception:
    pass

import string
from string import \
    Formatter, Template, ascii_letters, ascii_lowercase, ascii_uppercase, \
    capwords, digits, hexdigits, octdigits, printable, punctuation, \
    whitespace

alphabet = ascii_letters
letters = ascii_letters
lowercase = ascii_lowercase
uppercase = ascii_uppercase

from textwrap import dedent, fill, indent, shorten, wrap

from time import \
    altzone, asctime, \
    ctime, daylight, get_clock_info, \
    gmtime, localtime, mktime, monotonic, monotonic_ns, perf_counter, \
    perf_counter_ns, process_time, process_time_ns, \
    sleep, strftime, strptime, struct_time, thread_time, thread_time_ns, \
    time, time_ns, timezone, tzname
try:
    from time import \
        clock_getres, clock_gettime, clock_gettime_ns, clock_settime, \
        clock_settime_ns, pthread_getcpuclockid, tzset
except Exception:
    pass

from unicodedata import \
    bidirectional, category, combining, decimal, decomposition, digit, \
    east_asian_width, is_normalized, lookup, mirrored, name, normalize, \
    numeric

from urllib.parse import \
    parse_qs, parse_qsl, quote, quote_from_bytes, quote_plus, unquote, \
    unquote_plus, unquote_to_bytes, unwrap, urldefrag, urlencode, urljoin, \
    urlparse, urlsplit, urlunparse, urlunsplit


class Skip:
    'Custom type which some funcs type-check to skip values in containers.'

    def __init__(self, *args) -> None:
        pass


# skip is a ready-to-use value which some funcs filter against: this way
# filtering values becomes a special case of transforming values
skip = Skip()

# re_cache is used by custom func compile to cache previously-compiled
# regular-expressions, which makes them quicker to (re)use in formulas
re_cache: Dict[str, Pattern] = {}

# ire_cache is like re_cache, except it's for case-insensitive regexes
ire_cache: Dict[str, Pattern] = {}

# ansi_style_re detects the most commonly-used ANSI-style sequences, and
# is used in func plain
ansi_style_re = compile_uncached('\x1b\\[([0-9;]+m|[0-9]*[A-HJKST])')

# number_re detects numbers, and is used in func numbers
number_re = compile_uncached('\\W-?[0-9]+(\\.[0-9]*)?\\W')

# link_re detects web links, and is used in func links
link_re_src = 'https?://[A-Za-z0-9+_.:%-]+(/[A-Za-z0-9+_.%/,#?&=-]*)*'
link_re = compile_uncached(link_re_src)

# paddable_tab_re detects single tabs and possible runs of spaces around
# them, and is used in func squeeze
paddable_tab_re = compile_uncached(' *\t *')

# seen remembers values already given to func `once`
seen = set()

# commented_re detects strings/lines which start as unix-style comments
commented_re = compile_uncached('^ *#')

# emptyish_re detects empty/emptyish strings/lines, the latter being strings
# with only spaces in them
emptyish_re = compile_uncached('^ *\r?\n?$')

# spaces_re detects runs of 2 or more spaces, and is used in func squeeze;
# the explicit {2,} count keeps the pattern in step with this description
spaces_re = compile_uncached(' {2,}')

# awk_sep_re splits like AWK does by default, and is used in func fields
awk_sep_re = compile_uncached(' *\t *| +')


# some convenience aliases to commonly-used values

false = False
true = True
nil = None
nihil = None
none = None
null = None
s = ''

months = [
    'January', 'February', 'March', 'April', 'May', 'June',
    'July', 'August', 'September', 'October', 'November', 'December',
]

monweek = [
    'Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday',
    'Saturday', 'Sunday',
]

sunweek = [
    'Sunday',
    'Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday',
]

phy = {
    'kilo': 1_000,
    'mega': 1_000_000,
    'giga': 1_000_000_000,
    'tera': 1_000_000_000_000,
    'peta': 1_000_000_000_000_000,
    'exa': 1_000_000_000_000_000_000,
    'zetta': 1_000_000_000_000_000_000_000,

    'c': 299_792_458,
    'kcd': 683,
    'na': 602214076000000000000000,

    'femto': 1e-15,
    'pico': 1e-12,
    'nano': 1e-9,
    'micro': 1e-6,
    'milli': 1e-3,

    'e': 1.602176634e-19,
    'f': 96_485.33212,
    'h': 6.62607015e-34,
    'k': 1.380649e-23,
    'mu': 1.66053906892e-27,

    'ge': 9.7803267715,
    'gn': 9.80665,
}

physics = phy

# using literal strings on the cmd-line is often tricky/annoying: some of
# these aliases can help get around multiple levels of string-quoting; no
# quotes are needed as the script will later make these values accessible
# via the property/dot syntax
sym = {
    'amp': '&',
    'ampersand': '&',
    'ansiclear': '\x1b[0m',
    'ansinormal': '\x1b[0m',
    'ansireset': '\x1b[0m',
    'apo': '\'',
    'apos': '\'',
    'ast': '*',
    'asterisk': '*',
    'at': '@',
    'backquote': '`',
    'backslash': '\\',
    'backtick': '`',
    'ball': '●',
    'bang': '!',
    'bigsigma': 'Σ',
    'block': '█',
    'bquo': '`',
    'bquote': '`',
    'bslash': '\\',
    'btick': '`',
    'bullet': '•',
    'caret': '^',
    'cdot': '·',
    'circle': '●',
    'colon': ':',
    'comma': ',',
    'cr': '\r',
    'crlf': '\r\n',
    'cross': '×',
    'cs': ', ',
    'dash': '—',
    'dollar': '$',
    'dot': '.',
    'dquo': '"',
    'dquote': '"',
    'emark': '!',
    'emdash': '—',
    'empty': '',
    'endash': '–',
    'eq': '=',
    'et': '&',
    'euro': '€',
    'ge': '≥',
    'geq': '≥',
    'gt': '>',
    'hellip': '…',
    'hole': '○',
    'hyphen': '-',
    'infinity': '∞',
    'lcurly': '{',
    'ldquo': '“',
    'ldquote': '“',
    'le': '≤',
    'leq': '≤',
    'lf': '\n',
    'lt': '<',
    'mdash': '—',
    'mdot': '·',
    'miniball': '•',
    'minus': '-',
    'ndash': '–',
    'neq': '≠',
    'perc': '%',
    'percent': '%',
    'period': '.',
    'plus': '+',
    'qmark': '?',
    'que': '?',
    'rcurly': '}',
    'rdquo': '”',
    'rdquote': '”',
    'sball': '•',
    'semi': ';',
    'semicolon': ';',
    'sharp': '#',
    'slash': '/',
    'space': ' ',
    'square': '■',
    'squo': '\'',
    'squote': '\'',
    'tab': '\t',
    'tilde': '~',
    'underscore': '_',
    'uscore': '_',
    'utf8bom': '\xef\xbb\xbf',
    'utf16be': '\xfe\xff',
    'utf16le': '\xff\xfe',
}

symbols = sym

units = {
    'cup2l': 0.23658824,
    'floz2l': 0.0295735295625,
    'floz2ml': 29.5735295625,
    'ft2m': 0.3048,
    'gal2l': 3.785411784,
    'in2cm': 2.54,
    'lb2kg': 0.45359237,
    'mi2km': 1.609344,
    'mpg2kpl': 0.425143707,
    'nmi2km': 1.852,
    'oz2g': 28.34952312,
    'psi2pa': 6894.757293168,
    'ton2kg': 907.18474,
    'yd2m': 0.9144,

    'mol': 602214076000000000000000,
    'mole': 602214076000000000000000,

    'hour': 3_600,
    'day': 86_400,
    'week': 604_800,

    'hr': 3_600,
    'wk': 604_800,

    'kb': 1024,
    'mb': 1024**2,
    'gb': 1024**3,
    'tb': 1024**4,
    'pb': 1024**5,
}

# some convenience aliases to various funcs from the python stdlib
geomean = geometric_mean
harmean = harmonic_mean
sd = stdev
popsd = \
    pstdev
var = variance
popvar = pvariance
randbeta = betavariate
randexp = expovariate
randgamma = gammavariate
randlognorm = lognormvariate
randnorm = normalvariate
randweibull = weibullvariate

capitalize = str.capitalize
casefold = str.casefold
center = str.center
# count = str.count
decode = bytes.decode
encode = str.encode
endswith = str.endswith
expandtabs = str.expandtabs
find = str.find
format = str.format
index = str.index
isalnum = str.isalnum
isalpha = str.isalpha
isascii = str.isascii
isdecimal = str.isdecimal
isdigit = str.isdigit
isidentifier = str.isidentifier
islower = str.islower
isnumeric = str.isnumeric
isprintable = str.isprintable
isspace = str.isspace
istitle = str.istitle
isupper = str.isupper
# join = str.join
ljust = str.ljust
lower = str.lower
lowered = str.lower
lstrip = str.lstrip
maketrans = str.maketrans
partition = str.partition
removeprefix = str.removeprefix
removesuffix = str.removesuffix
replace = str.replace
rfind = str.rfind
rindex = str.rindex
rjust = str.rjust
rpartition = str.rpartition
rsplit = str.rsplit
rstrip = str.rstrip
# split = str.split
splitlines = str.splitlines
startswith = str.startswith
strip = str.strip
swapcase = str.swapcase
title = str.title
translate = str.translate
upper = str.upper
uppered = str.upper
zfill = str.zfill

every = all
rev = reversed
reverse = reversed
some = any

length = len

blowtabs = str.expandtabs
hasprefix = str.startswith
hassuffix = str.endswith
ltrim = str.lstrip
stripstart = str.lstrip
trimspace = str.strip
trimstart = str.lstrip
rtrim = str.rstrip
stripend = str.rstrip
trimend = str.rstrip
stripped = \
    str.strip
trim = str.strip
trimmed = str.strip
trimprefix = str.removeprefix
trimsuffix = str.removesuffix


def required_arg_count(f: Callable) -> int:
    if isinstance(f, type):
        return 1

    meta = getfullargspec(f)
    n = len(meta.args)
    if meta.defaults:
        n -= len(meta.defaults)
    return n


def identity(x: Any) -> Any:
    '''
    Return the value given: this is the default transformer for several
    higher-order funcs, which effectively keeps original items as given.
    '''
    return x

idem = identity
iden = identity


def after(x: Union[str, Iterable], what: Any) -> Union[str, Iterable]:
    'Skip parts of strings/sequences up to the substring/value given.'
    return (strafter if isinstance(x, str) else itemsafter)(x, what)

def afterlast(x: Union[str, Iterable], what: Any) -> Union[str, Iterable]:
    'Skip parts of strings/sequences up to the last substring/value given.'
    return (strafterlast if isinstance(x, str) else itemsafterlast)(x, what)

afterfinal = afterlast

def arrayish(x: Any) -> bool:
    'Check if a value is array-like enough.'
    return isinstance(x, (list, tuple, range, Generator))

isarrayish = arrayish

def base64(x):
    return base64bytes(str(x).encode()).decode()

def basename(s: str) -> str:
    'Get a filepath\'s last part, if present.'
    return Path(s).name

def before(x: Union[str, Iterable], what: Any) -> Union[str, Iterable]:
    'End strings/sequences right before a substring/value\'s appearance.'
    return (strbefore if isinstance(x, str) else itemsbefore)(x, what)

def beforelast(x: Union[str, Iterable], what: Any) -> Union[str, Iterable]:
    'End strings/sequences right before a substring/value\'s last appearance.'
    return (strbeforelast if isinstance(x, str) else itemsbeforelast)(x, what)

beforefinal = beforelast

def cases(x: Any, *args: Any) -> Any:
    '''
    Simulate a switch statement on a value, using matches/result pairs from
    the arguments given; when given an even number of extra args, None is
    used as a final fallback result; when given an odd number of extra args,
    the last argument is used as a final `default` value, if needed.
    '''

    for i in range(0, len(args) - len(args) % 2, 2):
        test, res = args[i], args[i+1]
        if isinstance(test, (list, tuple)) and x in test:
            return res
        # NaN never equals itself, so NaN tests need their own check; the
        # type-check on x keeps isnan from failing on non-numeric values
        if isinstance(test, float) and isnan(test) and \
                isinstance(x, float) and isnan(x):
            return res
        if x == test:
            return res
    return None if len(args) % 2 == 0 else args[-1]

switch = cases

def chunk(items: Iterable, chunk_size: int) -> Iterable:
    'Break iterable into chunks, each with up to the item-count given.'

    # check the chunk-size before anything else, since a non-positive size
    # would make the string loop below run forever
    if not isinstance(chunk_size, int):
        raise Exception('non-integer chunk-size')
    if chunk_size < 1:
        raise Exception('non-positive chunk-size')

    if isinstance(items, str):
        n = len(items)
        while n >= chunk_size:
            yield items[:chunk_size]
            items = items[chunk_size:]
            n -= chunk_size
        if n > 0:
            yield items
        return

    it = iter(items)
    while True:
        head = tuple(islice(it, chunk_size))
        if not head:
            return
        yield head

chunked = chunk

def commented(s: str) -> bool:
    'Check if a string starts as a unix-style comment.'
    return commented_re.match(s) is not None

iscommented = commented

def compile(s: str, case_sensitive: bool = True) -> Pattern:
    'Cached regex `compiler`, so it\'s quicker to (re)use in formulas.'

    cache = re_cache if case_sensitive else ire_cache
    options = 0 if case_sensitive else IGNORECASE

    if s in cache:
        return cache[s]
    e = compile_uncached(s, options)
    cache[s] = e
    return e

def compose(*what: Callable) -> Callable:
    def composite(x: Any) -> Any:
        for f in what:
            x = f(x)
        return x
    return composite

composed = compose
lcompose = compose
lcomposed = compose

def cond(*args: Any) -> Any:
    '''
    Simulate a chain of if-else statements, using condition/result pairs
    from the arguments given; when given an even number of args, None is
    used as a final fallback result; when given an odd number of args, the
    last argument is used as a final `else` value, if needed.
    '''

    for i in range(0, len(args) - len(args) % 2, 2):
        if args[i]:
            return args[i+1]
    return None if len(args) % 2 == 0 else args[-1]

def conform(x: Any, denan: Any = None, deinf: Any = None, fn = str) -> Any:
    'Make values JSON-compatible.'

    def rec(v: Any) -> Any:
        # nested values use the same replacements as the top-level call
        return conform(v, denan, deinf, fn)

    if isinstance(x, float):
        # turn NaNs and Infinities into the replacement values given
        if isnan(x):
            return denan
        if isinf(x):
            return deinf
        return x

    # None is already valid JSON (null), so it's kept as given
    if x is None or isinstance(x, (bool, int, float, str)):
        return x

    if isinstance(x, dict):
        return {
            str(k): rec(v) for k, v in x.items() if not
            (isinstance(k, Skip) or isinstance(v, Skip))
        }

    if isinstance(x, Iterable):
        return [rec(e) for e in x if not isinstance(e, Skip)]

    if isinstance(x, DotCallable):
        return x.value

    return fn(x)

fix = conform

def countif(src: Iterable, check: Callable) -> int:
    '''
    Count how many values make the func given true-like. This func works with
    sequences, dictionaries, and strings.
    '''

    if callable(src):
        src, check = check, src
    check = predicate(check)

    total = 0
    if isinstance(src, dict):
        for v in src.values():
            if check(v):
                total += 1
    else:
        for v in src:
            if check(v):
                total += 1
    return total

# def debase64(x):
#     return debase64bytes(str(x).encode()).decode()

def debase64(s: str) -> bytes:
    'Convert away from base64 encoding, including data-URIs.'

    if s.startswith('data:'):
        i = s.find(',')
        if i >= 0:
            return standard_b64decode(s[i + 1:])
    return standard_b64decode(s)

unbase64 = debase64

def dedup(v: Iterable) -> Iterable:
    'Ignore reappearing items from iterables, after their first occurrence.'

    got = set()
    for e in v:
        if e not in got:
            got.add(e)
            yield e

dedupe = dedup
deduped = dedup
deduplicate = dedup
deduplicated = dedup
undup = dedup
undupe = dedup
unduped = dedup
unduplicate = dedup
unduplicated = dedup
unique = dedup
uniqued = dedup

def defunc(x: Any) -> Any:
    'Call if value is a func, or return it back as given.'
    return x() if callable(x) else x

callmemaybe = defunc
defunct = defunc
unfunc = defunc
unfunct = defunc

def dejson(x: Any, catch: Union[Callable[[Exception], Any], Any] = None) -> Any:
    'Safely parse JSON from strings.'
    try:
        return loads(x) if isinstance(x, str) else x
    except Exception as e:
        return catch(e) if callable(catch) else catch

unjson = dejson

def denan(x: Any, fallback: Any = None) -> Any:
    'Replace floating-point NaN with the alternative value given.'
    return x if not (isinstance(x, float) and isnan(x)) else fallback

def denil(*args: Any) -> Any:
    'Avoid None values, if possible: first value which isn\'t None wins.'
    for e in args:
        if e is not None:
            return e
    return None

denone = denil
denull = denil

def dirname(s: str) -> str:
    'Ignore the last part of a filepath.'
    return str(Path(s).parent)

def dive(into: Any, doing: Callable) -> Any:
    'Transform a nested value by calling a func via depth-first recursion.'

    # support args in either order
    if callable(into):
        into, doing = doing, into

    return _dive_kv(None, into, doing)

deepmap = dive
dive1 = dive

def divebin(x: Any, y: Any, doing: Callable) -> Any:
    'Nested 2-value version of depth-first-recursive func dive.'

    # support args in either order
    if callable(x):
        x, y, doing = y, doing, x

    narg = required_arg_count(doing)
    if narg == 2:
        return dive(x, lambda a: dive(y, lambda b: doing(a, b)))
    if narg == 4:
        return dive(x, lambda i, a: dive(y, lambda j, b: doing(i, a, j, b)))
    raise Exception('divebin(...) only supports funcs with 2 or 4 args')

bindive = divebin
# diveboth = divebin
# dualdive = divebin
# duodive = divebin
dive2 = divebin

def _dive_kv(key: Any, into: Any, doing: Callable) -> Any:
    if isinstance(into, dict):
        return {k: _dive_kv(k, v, doing) for k, v in into.items()}
    if isinstance(into, Iterable) and not isinstance(into, str):
        return [_dive_kv(i, e, doing) for i, e in enumerate(into)]

    narg = required_arg_count(doing)
    return doing(key, into) if narg == 2 else doing(into)

class DotCallable:
    'Enable convenient dot-syntax calling of 1-input funcs.'

    def __init__(self, value: Any):
        self.value = value

    def __getattr__(self, key: str) -> Any:
        return DotCallable(globals()[key](self.value))

class Dottable:
    'Enable convenient dot-syntax access to dictionary values.'

    def __getattr__(self, key: Any) -> Any:
        return self.__dict__[key] if key in self.__dict__ else None

    def __getitem__(self, key: Any) -> Any:
        return self.__dict__[key] if key in self.__dict__ else None

    def __iter__(self) -> Iterable:
        return iter(self.__dict__)

def dotate(x: Any) -> Union[Dottable, Any]:
    'Recursively ensure all dictionaries in a value are dot-accessible.'

    if isinstance(x, dict):
        d = Dottable()
        d.__dict__ = {k: dotate(v) for k, v in x.items()}
        return d
    if isinstance(x, list):
        return [dotate(e) for e in x]
    if isinstance(x, tuple):
        return tuple(dotate(e) for e in x)
    return x

dotated = dotate
dote = dotate
doted = dotate
dotified = dotate
dotify = dotate
dottified = dotate
dottify = dotate

# make dictionaries `physics`, `symbols`, and `units` easier to use
phy = dotate(phy)
physics = phy
sym = dotate(sym)
symbols = sym
units = dotate(units)

def drop(src: Any, *what) -> Any:
    '''
    Either ignore all occurrences of the substrings given, or ignore all
    keys given from an object, or even from a sequence of objects.
    '''

    if isinstance(src, str):
        return strdrop(src, *what)
    return _itemsdrop(src, set(what))

dropped = drop
# ignore = drop
# ignored = drop

def _itemsdrop(src: Any, what: Set) -> Any:
    if isinstance(src, dict):
        kv = {}
        for k, v in src.items():
            if not (k in what):
                kv[k] = v
        return kv

    if isinstance(src, Iterable):
        return [_itemsdrop(e, what) for e in src]

    return None

def each(src: Iterable, f: Callable) -> Any:
    '''
    A generalization of built-in func map, which can also handle dictionaries
    and strings.
    '''

    if callable(src):
        src, f = f, src

    if isinstance(src, dict):
        return mapkv(src, lambda k, _: k, f)

    # accept funcs taking either a value, or an index/value pair: both the
    # string and the general-sequence loops below call f(index, value)
    f = loopify(f)

    if isinstance(src, str):
        s = StringIO()
        for i, c in enumerate(src):
            v = f(i, c)
            if not isinstance(v, Skip):
                s.write(str(v))
        return s.getvalue()

    return tuple(f(i, v) for i, v in enumerate(src))

mapped = each

def emptyish(x: Any) -> bool:
    '''
    Check if a value can be considered empty, which includes non-empty
    strings which only have spaces in them.
    '''

    def check(x: Any) -> bool:
        if not x:
            return True
        if isinstance(x, str):
            return bool(emptyish_re.match(x))
        return False

    if check(x):
        return True
    if isinstance(x, Iterable):
        return all(check(e) for e in x)
    return False

isemptyish = emptyish

def endict(x: Any) -> Dict[str, Any]:
    'Turn non-dictionary values into dictionaries with string keys.'

    if isinstance(x, dict):
        return {str(k): v for k, v in x.items()}
    if arrayish(x):
        return {str(e): e for e in x}
    return {str(x): x}

dicted = endict
endicted = endict
indict = endict
todict = endict

def enfloat(x: Any, fallback: float = nan) -> float:
    try:
        return float(x)
    except Exception:
        return fallback

enfloated = enfloat
floated = enfloat
floatify = enfloat
floatize = enfloat
tofloat = enfloat

def enint(x: Any, fallback: Any = None) -> Any:
    try:
        return int(x)
    except Exception:
        return fallback

eninted = enint
inted = enint
integered = enint
intify = enint
intize = enint
toint = enint

def enlist(x: Any) -> List[Any]:
    'Turn non-list values into lists.'
    return list(x) if arrayish(x) else [x]

# inlist = enlist
enlisted = enlist
listify = enlist
listize = enlist
tolist = enlist

def entuple(x: Any) -> Tuple[Any, ...]:
    'Turn non-tuple values into tuples.'
    return tuple(x) if arrayish(x) else (x, )

entupled = entuple
ntuple = entuple
ntupled = entuple
tuplify = entuple
tuplize = entuple
toentuple = entuple
tontuple = entuple
totuple = entuple

def error(message: Any) -> Exception:
    return Exception(str(message))

err = error

def ext(s: str) -> str:
    'Get a filepath\'s extension, if present.'

    name = Path(s).name
    i = name.rfind('.')
    return name[i:] if i >= 0 else ''

filext = ext

def fail(message: Any, error_code: int = 255) -> NoReturn:
    stdout.flush()
    print(f'\x1b[31m{message}\x1b[0m', file=stderr)
    quit(error_code)

abort = fail
bail = fail

def fields(s: str) -> Iterable[str]:
    'Split fields AWK-style from the string given.'
    return awk_sep_re.split(s.strip())

# items = fields
splitfields = fields
splititems = fields
words = fields

def first(items: SupportsIndex, fallback: Any = None) -> Any:
    return items[0] if len(items) > 0 else fallback

def flappend(*args: Any) -> List[Any]:
    'Turn arbitrarily-nested values/sequences into a single flat sequence.'

    flat = []
    def dig(x: Any) -> None:
        if arrayish(x):
            for e in x:
                dig(e)
        elif isinstance(x, dict):
            for e in x.values():
                dig(e)
        else:
            flat.append(x)

    for e in args:
        dig(e)
    return flat

def flat(*args: Any) -> Iterable:
    'Turn arbitrarily-nested values/sequences into a single flat sequence.'

    def _flat_rec(x: Any) -> Iterable:
        if x is None:
            return

        if isinstance(x, dict):
            # only the values are flattened: returning here keeps the
            # general-iterable check below from also emitting the keys
            yield from _flat_rec(x.values())
            return

        if isinstance(x, str):
            yield x
            return

        if isinstance(x, Iterable):
            for e in x:
                yield from _flat_rec(e)
            return

        yield x

    for x in args:
        yield from _flat_rec(x)

flatten = flat
flattened = flat

def fromto(start, stop, f: Callable = identity) -> Iterable:
    'Sequence all integers between the numbers given, end-value included.'
    return (f(e) for e in range(start, stop + 1))

def fuzz(x: Union[int, float]) -> Union[float, Dict[str, float]]:
    '''
    Deapproximate numbers to their max range before approximation: the
    result is a dictionary with the guessed lower-bound number, the number
    given, and the guessed upper-bound number which can approximate to the
    original number given. NaNs and the infinities are returned as given,
    instead of resulting in a dictionary.
    '''

    if isnan(x) or isinf(x):
        return x

    if x == 0:
        return {'-0.5': -0.5, '0': 0.0, '0.5': +0.5}

    if x % 1 != 0:
        # return surrounding integers when given non-integers
        a = floor(x)
        b = ceil(x)
        return {str(a): a, str(x): x, str(b): b}

    if x % 10 != 0:
        a = x - 0.5
        b = x + 0.5
        return {str(a): a, str(x): x, str(b): b}

    # find the integer log10 of the absolute value; 0 was handled previously
    y = int(abs(x))
    p10 = 1
    while True:
        if y % p10 != 0:
            p10 /= 10
            break
        p10 *= 10
    delta = p10 / 2

    s = +1 if x > 0 else -1
    ux = abs(x)
    a = s * ux - delta
    b = s * ux + delta
    return {str(a): a, str(x): x, str(b): b}

def generated(src: Any) -> Any:
    'Make tuples out of generators, or return non-generator values as given.'
    return tuple(src) if isinstance(src, (Generator, range)) else src

concrete = generated
concreted = generated
concretize = generated
concretized = generated
degen = generated
degenerate = generated
degenerated = generated
degenerator = generated
gen = generated
generate = generated
synth = generated
synthed = generated
synthesize = generated
synthesized = generated

def group(src: Iterable, by: Callable = identity) -> Dict:
    '''
    Separate transformed items into arrays, the final result being a dict
    whose keys are all the transformed values, and whose values are lists
    of all the original values which did transform to their group's key.
    '''

    if callable(src):
        src, by = by, src

    by = loopify(by)
    kv = src.items() if isinstance(src, dict) else enumerate(src)

    groups = {}
    for k, v in kv:
        dk = by(k, v)
        if isinstance(dk, Skip) or isinstance(v, Skip):
            continue
        if dk in groups:
            groups[dk].append(v)
        else:
            groups[dk] = [v]
    return groups

grouped = group

def gire(src: Iterable[str], using: Iterable[str], fallback: Any = '') -> Dict:
    '''
    Group matched items into arrays, the final result being a dict whose
    keys are all the matchable regexes given, and whose values are lists
    of all the original values which did case-insensitively match their
    group's key as a regex.
    '''

    using = tuple(using)
    return group(src, lambda x: imatch(x, using, fallback))

gbire = gire
groupire = gire

def gre(src: Iterable[str], using: Iterable[str], fallback: Any = '') -> Dict:
    '''
    Group matched items into arrays, the final result being a dict whose
    keys are all the matchable regexes given, and whose values are lists
    of all the original values which did regex-match their group's key.
    '''

    using = tuple(using)
    return group(src, lambda x: match(x, using, fallback))

gbre = gre
groupre = gre

def gsub(s: str, what: str, repl: str) -> str:
    'Replace all regex-matches with the string given.'
    return compile(what).sub(repl, s)

def harden(f: Callable, fallback: Any = None) -> Callable:
    def _hardened_caller(*args):
        try:
            return f(*args)
        except Exception:
            return fallback
    return _hardened_caller

hardened = harden
insure = harden
insured = harden

def horner(coeffs: List[float], x: Union[int, float]) -> float:
    if isinstance(coeffs, (int, float)):
        coeffs, x = x, coeffs

    if len(coeffs) == 0:
        return 0

    y = coeffs[0]
    for c in islice(coeffs, 1, None):
        y *= x
        y += c
    return y

polyval = horner

def idiota(n: int, f: Callable = identity) -> Dict[int, int]:
    'ID (keys) version of func iota.'
    return {v: v for v in (f(e) for e in range(1, n + 1))}

dictiota = idiota
kviota = idiota

def imatch(what: str, using: Iterable[str], fallback: str = '') -> str:
    'Try to case-insensitively match a string with any of the regexes given.'

    if not isinstance(what, str):
        what, using = using, what

    for s in using:
        expr = compile(s, False)
        m = expr.search(what)
        if m:
            # return what[m.start():m.end()]
            return s
    return fallback

def indices(x: Any) -> Iterable[Any]:
    'List all indices/keys, or get an exclusive range from an int.'

    if isinstance(x, int):
        return range(x)
    if isinstance(x, dict):
        return x.keys()
    if isinstance(x, (str, list, tuple)):
        return range(len(x))
    return tuple()

keys = indices

def ints(start, stop, f: Callable = identity) -> Iterable[int]:
    'Sequence integers, end-value included.'

    if isnan(start) or isnan(stop) or isinf(start) or isinf(stop):
        return tuple()
    return (f(e) for e in range(int(ceil(start)), int(stop) + 1))

integers = ints

def iota(n: int, f: Callable = identity) -> Iterable[int]:
    'Sequence all integers from 1 up to (and including) the int given.'
    return (f(e) for e in range(1, n + 1))

def itemsafter(x: Iterable, what: Any) -> Iterable:
    ok = False
    check = predicate(what)
    for e in x:
        if ok:
            yield e
        elif check(e):
            ok = True

def itemsafterlast(x: Iterable, what: Any) -> Iterable:
    rest: List[Any] = []
    check = predicate(what)
    for e in x:
        if check(e):
            rest.clear()
        else:
            rest.append(e)

    # matching items are never kept, so everything left comes after the
    # last match; when nothing matches, all items are emitted
    for e in rest:
        yield e

def itemsbefore(x: Iterable, what: Any) -> Iterable:
    check = predicate(what)
    for e in x:
        if check(e):
            return
        yield e

def itemsbeforelast(x: Iterable, what: Any) -> Iterable:
    items = []
    for e in x:
        items.append(e)

    i = -1
    check = predicate(what)
    for j, e in enumerate(reversed(items)):
        if check(e):
            # turn the position counted from the end into a normal index
            i = len(items) - 1 - j
            break

    if i < 0:
        # nothing matched: emit all items; this is a generator, where a
        # plain `return items` would emit nothing
        yield from items
        return
    for e in islice(items, 0, i):
        yield e

def itemssince(x: Iterable, what: Any) -> Iterable:
    ok = False
    check = predicate(what)
    for e in x:
        ok = ok or check(e)
        if ok:
            yield e

def itemssincelast(x: Iterable, what: Any) -> Iterable:
    rest: List[Any] = []
    check = predicate(what)
    for e in x:
        if check(e):
            rest.clear()
        # the matching item itself is kept, as `since` includes the match
        rest.append(e)
    return rest

def itemsuntil(x: Iterable, what: Any) -> Iterable:
    check = predicate(what)
    for e in x:
        yield e
        if check(e):
            return

def itemsuntillast(x: Iterable, what: Any) -> Iterable:
    items = []
    for e in x:
        items.append(e)

    i = -1
    check = predicate(what)
    for j, e in enumerate(reversed(items)):
        if check(e):
            # turn the position counted from the end into a normal index
            i = len(items) - 1 - j
            break

    if i < 0:
        # nothing matched: emit all items; this is a generator, where a
        # plain `return items` would emit nothing
        yield from items
        return
    for e in islice(items, 0, i + 1):
        yield e

itemsuntilfinal = itemsuntillast

def join(items: Iterable, sep: Union[str, Iterable] = ' ') -> Union[str, Dict]:
    '''
    Join iterables using the separator-string given: its 2 arguments
    can come in either order, and are sorted out if needed. When given
    2 non-string iterables, the result is an object whose keys are from
    the first argument, and whose values are from the second one.

    You can use it any of the following ways, where `keys` and `values` are
    sequences (lists, tuples, or generators), and `separator` is a string:

    join(values)
    join(values, separator)
    join(separator, values)
    join(keys, values)
    '''

    if arrayish(items) and arrayish(sep):
        return {k: v for k, v in zip(items, sep)}
    if isinstance(items, str):
        items, sep = sep, items
    return sep.join(str(e) for e in items)

def joined_paragraphs(lines: Iterable[str]) -> Iterable[Sequence[str]]:
    '''
    Regroup lines into individual paragraphs, each of which can span multiple
    lines: such paragraphs have no empty lines in them, and never end with a
    trailing line-feed.
    '''

    par: List[str] = []
    for l in lines:
        if not l:
            # empty lines only end the current paragraph, if there is one,
            # and are never part of any paragraph themselves
            if par:
                yield '\n'.join(par)
                par.clear()
        else:
            par.append(l)

    if len(par) > 0:
        yield '\n'.join(par)

def json0(x: Any) -> str:
    'Encode value into a minimal single-line JSON string.'
    return dumps(x, separators=(',', ':'), allow_nan=False, indent=None)

j0 = json0

def json2(x: Any) -> str:
    '''
    Encode value into a (possibly multiline) JSON string, using 2 spaces for
    each indentation level.
    '''
    return dumps(x, separators=(',', ': '), allow_nan=False, indent=2)

j2 = json2

def jsonl(x: Any) -> Iterable:
    'Turn value into multiple JSON-encoded strings, known as JSON Lines.'

    if x is None:
        yield dumps(x, allow_nan=False)
    elif isinstance(x, (bool, int, float, dict, str)):
        yield dumps(x, allow_nan=False)
    elif isinstance(x, Iterable):
        for e in x:
            yield dumps(e, allow_nan=False)
    else:
        yield dumps(str(x), allow_nan=False)

jsonlines = jsonl
tojsonl = jsonl
tojsonlines = jsonl

def keep(src: Iterable, pred: Any) -> Iterable:
    '''
    A generalization of built-in func filter, which can also handle dicts and
    strings.
    '''

    if callable(src):
        src, pred = pred, src
    pred = predicate(pred)
    pred = loopify(pred)

    if isinstance(src, str):
        out = StringIO()
        for i, c in enumerate(src):
            if pred(i, c):
                out.write(c)
        return out.getvalue()

    if isinstance(src, dict):
        return { k: v for k, v in src.items() if pred(k, v) }
    return (e for i, e in enumerate(src) if pred(i, e))

filtered = keep
kept = keep

def last(items: SupportsIndex, fallback: Any = None) -> Any:
    return items[-1] if len(items) > 0 else fallback

def links(src: Any) -> Iterable:
    'Auto-detect all (HTTP/HTTPS) hyperlink-like substrings.'

    if isinstance(src, str):
        for match in link_re.finditer(src):
            # yield src[match.start():match.end()]
            yield match.group(0)
    elif isinstance(src, dict):
        for k, v in src.items():
            # search keys as well: a plain `yield from k` would emit the
            # key's individual characters instead
            yield from links(k)
            yield from links(v)
    elif isinstance(src, Iterable):
        for v in src:
            yield from links(v)

def loopify(x: Callable) -> Callable:
    nargs = required_arg_count(x)
    if nargs == 2:
        return x
    elif nargs == 1:
        return lambda _, v: x(v)
    else:
        raise Exception('only funcs with 1 or 2 args are supported')

def mapkv(src: Iterable, key: Callable, value: Callable = identity) -> Dict:
    '''
    A map-like func for dictionaries, which uses 2 mapping funcs, the first
    for the keys, the second for the values.
    '''

    if key is None:
        key = lambda k, _: k

    if callable(src):
        src, key, value = value, src, key

    if required_arg_count(key) != 2:
        oldkey = key
        key = lambda k, _: oldkey(k)

    key = loopify(key)
    value = loopify(value)
    # if isinstance(src, dict):
    #     return { key(k, v): value(k, v) for k, v in src.items() }
    # return { key(i, v): value(i, v) for i, v in enumerate(src) }

    def add(k, v, to):
        dk = key(k, v)
        dv = value(k, v)
        if isinstance(dk, Skip) or isinstance(dv, Skip):
            return
        to[dk] = dv

    res = {}
    kv = src.items() if isinstance(src, dict) else enumerate(src)
    for k, v in kv:
        add(k, v, res)
    return res

def match(what: str, using: Iterable[str], fallback: str = '') -> str:
    'Try to match a string with any of the regexes given.'

    if not isinstance(what, str):
        what, using = using, what

    for s in using:
        expr = compile(s)
        m = expr.search(what)
        if m:
            # return what[m.start():m.end()]
            return s
    return fallback

def maybe(f: Callable, x: Any) -> Any:
    '''
    Try calling a func on a value, using the same value as a fallback result,
    in case of exceptions.
    '''

    if not callable(f):
        f, x = x, f
    try:
        return f(x)
    except Exception:
        return x

def mappend(*args) -> Dict:
    kv = {}
    for src in args:
        if isinstance(src, dict):
            for k, v in src.items():
                kv[k] = v
        else:
            raise Exception('mappend only works with dictionaries')
    return kv

def message(x: Any, result: Any = skip) -> Any:
    print(x, file=stderr)
    return result

msg = message

def must(cond: Any, errmsg: str = 'condition given not always true') -> None:
    'Enforce conditions, raising an exception on failure.'
    if not cond:
        raise Exception(errmsg)

demand = must
enforce = must

def nowdict() -> dict:
    v = datetime.now()
    return {
        'year': v.year,
        'month': v.month,
        'day': v.day,
        'hour': v.hour,
        'minute': v.minute,
        'second': v.second,
        'text': v.strftime('%Y-%m-%d %H:%M:%S %b %a'),
        'weekday': v.strftime('%A'),
    }

def number(x: Any) -> Union[int, float, Any]:
    '''
    Try to turn the value given into a number, using the value itself as the
    fallback result, instead of raising exceptions.
    '''

    if isinstance(x, float):
        return x

    try:
        return int(x)
    except Exception:
        pass

    try:
        return float(x)
    except Exception:
        # neither conversion worked: give back the original value
        return x

def numbers(src: Any) -> Iterable:
    'Auto-detect all number-like substrings.'

    if isinstance(src, str):
        for match in number_re.finditer(src):
            yield match.group(0)
            # yield src[match.start():match.end()]
    elif isinstance(src, dict):
        for k, v in src.items():
            # recurse using this func, instead of func links
            yield from numbers(k)
            yield from numbers(v)
    elif isinstance(src, Iterable):
        for v in src:
            yield from numbers(v)

def numsign(x: Union[int, float]) -> Union[int, float]:
    'Get a number\'s sign, or NaN if the number given is a NaN.'

    if isinstance(x, int):
        if x > 0:
            return +1
        if x < 0:
            return -1
        return 0

    if isnan(x):
        return x

    if x > 0:
        return +1.0
    if x < 0:
        return -1.0
    return 0.0

def numstats(src: Any) -> Dict[str, Union[float, int]]:
    'Gather several single-pass numeric statistics.'

    n = mean_sq = ln_sum = 0
    least = +inf
    most = -inf
    total = mean = 0
    prod = 1
    nans = ints = pos = zero = neg = 0

    def update_numstats(x: Any) -> None:
        nonlocal nans, n, ints, pos, neg, zero, least, most, total, prod
        nonlocal ln_sum, mean, mean_sq

        if not isinstance(x, (float, int)):
            return

        if isnan(x):
            nans += 1
            return

        n += 1
        ints += int(isinstance(x, int) or x == floor(x))

        if x > 0:
            pos += 1
        elif x < 0:
            neg += 1
        else:
            zero += 1

        least = min(least, x)
        most = max(most, x)

        # total += x
        prod *= x
        # func log raises exceptions for non-positive numbers: using -inf
        # for those makes the geometric mean end up as a NaN instead
        ln_sum += log(x) if x > 0 else -inf

        # update the running mean and sum of squared differences
        d1 = x - mean
        mean += d1 / n
        d2 = x - mean
        mean_sq += d1 * d2

    def _numstats_rec(src: Any) -> None:
        if isinstance(src, dict):
            for e in src.values():
                _numstats_rec(e)
        elif isinstance(src, Iterable) and not isinstance(src, str):
            for e in src:
                _numstats_rec(e)
        else:
            update_numstats(src)

    _numstats_rec(src)

    sd = nan
    geomean = nan
    if n > 0:
        sd = sqrt(mean_sq / n)
        geomean = exp(ln_sum /
            n) if not isinf(ln_sum) else nan
        total = n * mean

    return {
        'n': n,
        'nan': nans,
        'min': least,
        'max': most,
        'sum': total,
        'mean': mean,
        'geomean': geomean,
        'sd': sd,
        'product': prod,
        'integer': ints,
        'positive': pos,
        'zero': zero,
        'negative': neg,
    }

def once(x: Any, replacement: Any = None) -> Any:
    '''
    Replace the first argument given after the first time this func has been
    given it: this is a deliberately stateful function, given its purpose.
    '''

    if not (x in seen):
        seen.add(x)
        return x
    else:
        return replacement

onced = once

def pad(s: str, n: int, pad: str = ' ') -> str:
    l = len(s)
    return s if l >= n else s + int((n - l) / len(pad)) * pad

def padcenter(s: str, n: int, pad: str = ' ') -> str:
    return s.center(n, pad)

centerpad = padcenter
centerpadded = padcenter
cjust = padcenter
cpad = padcenter
padc = padcenter
paddedcenter = padcenter

def padend(s: str, n: int, pad: str = ' ') -> str:
    # padding the end of a string means left-justifying it
    return s.ljust(n, pad)

padr = padend
padright = padend
paddedend = padend
paddedright = padend
rpad = padend
rightpad = padend
rightpadded = padend

def padstart(s: str, n: int, pad: str = ' ') -> str:
    # padding the start of a string means right-justifying it
    return s.rjust(n, pad)

lpad = padstart
leftpad = padstart
leftpadded = padstart
padl = padstart
padleft = padstart
paddedleft = padstart
paddedstart = padstart

def panic(x: Any) -> None:
    raise Exception(x)

def paragraphize(lines: Iterable[str]) -> Iterable[Sequence[str]]:
    '''
    Regroup lines into individual paragraphs, each of which is a list of
    single-line strings, none of which ever ends with a trailing line-feed.
    '''

    par: List[str] = []
    for l in lines:
        if not l:
            # empty lines end the current paragraph, if there is one; each
            # paragraph is its own list, so results stay valid later on
            if par:
                yield par
                par = []
        else:
            par.append(l)

    if len(par) > 0:
        yield par

paragraphed = paragraphize
paragraphs = paragraphize
paragroup = paragraphize
pargroup = paragraphize

def parse(s: str, fallback: Any = None) -> Any:
    'Try to parse JSON, ignoring exceptions in favor of a fallback value.'

    try:
        return loads(s)
    except Exception:
        return fallback

fromjson = parse
parsed = parse
loaded = parse
unjson = parse

def pick(src: Any, *what) -> Any:
    'Pick only the keys given from an object, or even a sequence of objects.'

    if isinstance(src, dict):
        kv = {}
        for k in what:
            kv[k] = src[k]
        return kv

    if isinstance(src, Iterable):
        return [pick(e, *what) for e in src]

    return None

picked = pick

def plain(s: str) -> str:
    'Ignore all ANSI-style sequences in a string.'
    return ansi_style_re.sub('', s)

def predicate(x: Any) -> Callable:
    'Helps various higher-order funcs, by standardizing `predicate` values.'

    if callable(x):
        return x

    if isinstance(x, float):
        if isnan(x):
            return lambda y: isinstance(y, float) and isnan(y)
        if isinf(x):
            return lambda y: isinstance(y, float) and isinf(y)

    return lambda y: x == y

pred = predicate

def quoted(s: str, quote: str = '"') -> str:
    'Surround a string with quotes.'
    return f'{quote}{s}{quote}'

def recover(*args) -> Any:
    '''
    Catch exceptions using a lambda/callback func, in one of 6 ways:
        recover(zero_args_func)
        recover(zero_args_func, exception_replacement_value)
        recover(zero_args_func, one_arg_exception_handling_func)
        recover(one_arg_func, arg)
        recover(one_arg_func, arg, exception_replacement_value)
        recover(one_arg_func, arg, one_arg_exception_handling_func)
    '''

    if len(args) == 1:
        f = args[0]
        try:
            return f()
        except Exception:
            return None
    elif len(args) == 2:
        f, fallback = args[0], args[1]
        if callable(f) and callable(fallback):
            try:
                return f()
            except Exception as e:
                nargs = required_arg_count(fallback)
                return fallback(e) if nargs == 1 else fallback()
        else:
            try:
                return f() if required_arg_count(f) == 0 else f(args[1])
            except Exception:
                return fallback
    elif len(args) == 3:
        f, x, fallback = args[0], args[1], args[2]
        if callable(f) and callable(fallback):
            try:
                return f(x)
            except Exception as e:
                nargs = required_arg_count(fallback)
                return fallback(e) if nargs == 1 else fallback()
        else:
            try:
                return f(x)
            except Exception:
                return fallback
    else:
        raise Exception('recover(...) only works with 1, 2, or 3 args')

attempt = recover
attempted = recover
recovered = recover
recoverred = recover
rescue = recover
rescued = recover
trycall = recover

def reject(src: Iterable, pred: Any) -> Iterable:
    '''
    A generalization of built-in func filter, which uses predicate funcs the
    opposite way, and which can also handle dicts and strings.
    '''

    if callable(src):
        src, pred = pred, src
    pred = predicate(pred)
    pred = loopify(pred)

    if isinstance(src, str):
        out = StringIO()
        for i, c in enumerate(src):
            if not pred(i, c):
                out.write(c)
        return out.getvalue()

    if isinstance(src, dict):
        return { k: v for k, v in src.items() if not pred(k, v) }
    return (e for i, e in enumerate(src) if not pred(i, e))

avoid = reject
avoided = reject
keepout = reject
keptout = reject
rejected = reject

def retype(x: Any) -> Any:
    'Try to narrow the type of the value given.'

    if isinstance(x, float):
        return int(x) if floor(x) == x else x

    if not isinstance(x, str):
        return x

    try:
        return loads(x)
    except Exception:
        pass

    try:
        return int(x)
    except Exception:
        pass

    try:
        return float(x)
    except Exception:
        pass

    return x

autocast = retype
mold = retype
molded = retype
narrow = retype
narrowed = retype
recast = retype
recasted = retype
remold = retype
remolded = retype
retyped = retype

def revcompose(*what: Callable) -> Callable:
    def composite(x: Any) -> Any:
        for f in reversed(what):
            x = f(x)
        return x
    return composite

rcompose = revcompose
rcomposed = revcompose
revcomposed = revcompose

def revsort(iterable: Iterable, key: Optional[Callable] = None) -> Iterable:
    return sorted(iterable, key=key, reverse=True)

revsorted = revsort

# def revsortkv(src: Dict, key: Callable = None) -> Dict:
#     if not key:
#         key = lambda kv: (kv[1], kv[0])
#     return sortkv(src, key, reverse=True)

def revsortkv(src: Dict, key: Callable = None) -> Dict:
    if key is None:
        key = lambda x: x[1]
    return sortkv(src, key, reverse=True)

revsortedkv = \
revsortkv

def rstripdecs(s: str) -> str:
    '''
    Ignore trailing zero decimals on number-like strings; even ignore
    the decimal dot if trailing.
    '''

    try:
        f = float(s)
        if isnan(f) or isinf(f):
            return s

        dot = s.find('.')
        if dot < 0:
            return s

        s = s.rstrip('0')
        return s[:-1] if s.endswith('.') else s
    except Exception:
        return s

chopdecs = rstripdecs

def scale(x: float, x0: float, x1: float, y0: float, y1: float) -> float:
    'Transform a value from a linear domain into another linear one.'
    return (y1 - y0) * (x - x0) / (x1 - x0) + y0

rescale = scale
rescaled = scale
scaled = scale

def shortened(s: str, maxlen: int, trailer: str = '') -> str:
    'Limit strings to the symbol-count given, including an optional trailer.'
    maxlen = max(maxlen, 0)
    return s if len(s) <= maxlen else s[:maxlen - len(trailer)] + trailer

def shuffled(x: Any) -> Any:
    'Return a shuffled copy of the list given.'
    y = copy(x)
    shuffle(y)
    return y

def split(src: Union[str, Sequence], n: Union[str, int]) -> Iterable:
    'Split/break a string/sequence into several chunks/parts.'

    if isinstance(src, str) and isinstance(n, str):
        return src.split(n)
    if not (isinstance(src, (str, Sequence)) and isinstance(n, int)):
        raise Exception('unsupported type-pair of arguments')

    if n < 1:
        return []

    l = len(src)
    if l <= n:
        # string method split raises exceptions when given empty separators
        return list(src) if isinstance(src, str) else src

    chunks = []
    csize = int(ceil(l / n))
    while len(src) > 0:
        chunks.append(src[:csize])
        src = src[csize:]
    return chunks

broken = split
splitted = split
splitten = split

def strdrop(x: str, *what: str) -> str:
    'Ignore all occurrences of all substrings given.'

    for s in what:
        x = x.replace(s, '')
    return x

strignore = strdrop

def stringify(x: Any) -> str:
    'Fancy alias for func dumps, named after JavaScript\'s func.'
    return dumps(x, separators=(', ', ': '), allow_nan=False, indent=None)

jsonate = stringify
jsonify = stringify
tojson = stringify

def strafter(x: str, what: str) -> str:
    i = x.find(what)
    return '' if i < 0 else x[i+len(what):]

def strafterlast(x: str, what: str) -> str:
    i = x.rfind(what)
    return '' if i < 0 else x[i+len(what):]

def strbefore(x: str, what: str) -> str:
    i = x.find(what)
    return x if i < 0 else x[:i]

def strbeforelast(x: str, what: str) -> str:
    i = x.rfind(what)
    return x if i < 0 else x[:i]

def strsince(x: str, what: str) -> str:
    i = x.find(what)
    return '' if i < 0 else x[i:]

def strsincelast(x: str, what: str) -> str:
    i = x.rfind(what)
    return '' if i < 0 else x[i:]

def struntil(x: str, what: str) -> str:
    i = x.find(what)
    return x if i < 0 else x[:i+len(what)]

def struntillast(x: str, what: str) -> str:
    i = x.rfind(what)
    return x if i < 0 else x[:i+len(what)]

struntilfinal = struntillast

def since(x: Union[str, Iterable], what: Any) -> Union[str, Iterable]:
    'Start strings/sequences with a substring/value\'s appearance.'
    return (strsince if isinstance(x, str) else itemssince)(x, what)

def sincelast(x: Union[str, Iterable], what: Any) -> Union[str, Iterable]:
    'Start strings/sequences with a substring/value\'s last appearance.'
    return (strsincelast if isinstance(x, str) else itemssincelast)(x, what)

sincefinal = sincelast

def sortk(x: Dict, key: Callable = identity, reverse: bool = False) -> Dict:
    keys = sorted(x.keys(), key=key, reverse=reverse)
    return {k: x[k] for k in keys}

sortkeys = sortk
sortedkeys = sortk

def sortkv(src: Dict, key: Callable = None, reverse: bool = False) -> Dict:
    if key is None:
        key = lambda x: x[1]
    kv = sorted(src.items(), key=key, reverse=reverse)
    return {k: v for (k, v) in kv}

sortedkv = sortkv

def squeeze(s: str) -> str:
    '''
    A more aggressive way to rid strings of extra spaces which, unlike string
    method strip, also turns inner runs of multiple spaces into single ones.
    '''
    s = s.strip()
    s = spaces_re.sub(' ', s)
    s = paddable_tab_re.sub('\t', s)
    return s

squeezed = squeeze

def stround(x: Union[int, float], decimals: int = 6) -> str:
    'Format numbers into a string with the given decimal-digit count.'

    if decimals >= 0:
        return f'{x:.{decimals}f}'
    else:
        return f'{round(x, decimals):.0f}'

def tally(src: Iterable, by: Callable = identity) -> Dict[Any, int]:
    '''
    Count all distinct (transformed) values, the result being a dictionary
    whose keys are all the transformed values, and whose values are positive
    integers.
    '''

    if callable(src):
        src, by = by, src

    tally: Dict[Any, int] = {}
    by = loopify(by)

    if isinstance(src, dict):
        for k, v in src.items():
            dk = by(k, v)
            if dk in tally:
                tally[dk] += 1
            else:
                tally[dk] = 1
    else:
        for i, v in enumerate(src):
            dk = by(i, v)
            if dk in tally:
                tally[dk] += 1
            else:
                tally[dk] = 1
    return tally

tallied = tally

def transpose(src: Any) -> Any:
    'Turn lists/objects inside-out like socks, so to speak.'

    if isinstance(src, dict):
        return { v: k for k, v in src.items() }

    if not arrayish(src):
        msg = 'transpose only supports objects or iterables of objects'
        raise ValueError(msg)

    kv: Dict[Any, Any] = {}
    seq: List[Any] = []

    for e in src:
        if isinstance(e, dict):
            for k, v in e.items():
                if k in kv:
                    kv[k].append(v)
                else:
                    kv[k] = [v]
        elif isinstance(e, Iterable):
            for i, v in enumerate(e):
                if i < len(seq):
                    seq[i].append(v)
                else:
                    seq.append([v])
        else:
            msg = 'transpose(...): not all items are iterables/objects'
            raise ValueError(msg)

    if len(kv) > 0 and len(seq) > 0:
        msg = 'transpose(...): mix of iterables and objects not supported'
        raise ValueError(msg)
    return kv if len(seq) == 0 else seq

tr = transpose
transp = transpose
transposed = transpose

def trap(x: Callable, y: Union[Callable[[Exception], Any], Any] = None) -> Any:
    'Try running a func, handing exceptions over to a fallback func.'

    try:
        return x() if callable(x) else x
    except Exception as e:
        if callable(y):
            nargs = required_arg_count(y)
            return y(e) if nargs == 1 else y()
        else:
            return y

catch = trap
catched = trap
caught = trap
noerr = trap
noerror = trap
noerrors = trap
safe = trap
save = trap
saved = trap
trapped = trap

def tsv(x: str, fn: Union[Callable, None] = None) -> Any:
    if fn is None:
        return x.split('\t')
    if callable(x):
        x, fn = fn, x
    return fn(x.split('\t'))

def typename(x: Any) -> str:
    if x is None:
        return 'null'
    if isinstance(x, bool):
        return 'boolean'
    if isinstance(x, str):
        return 'string'
    if isinstance(x, (int, float)):
        return 'number'
    if isinstance(x, (list, tuple)):
        return 'array'
    if isinstance(x, dict):
        return 'object'
    return type(x).__name__

def typeof(x: Any) -> str:
    'Get a value\'s JS-like typeof type-string.'

    if callable(x):
        return 'function'

    return {
        bool: 'boolean',
        int: 'number',
        float: 'number',
        str: 'string',
    }.get(type(x), 'object')

def unixify(s: str) -> str:
    '''
    Make plain-text `unix-style`, ignoring a leading UTF-8 BOM if present,
    and turning any/all CRLF byte-pairs into line-feed bytes.
    '''

    # a decoded UTF-8 BOM is the single code-point U+FEFF; the 3-symbol
    # form shows up when BOM bytes were decoded one byte at a time; string
    # method lstrip would also drop repeats of those symbols in any order
    for bom in ('\ufeff', '\xef\xbb\xbf'):
        if s.startswith(bom):
            s = s[len(bom):]
            break
    return s.replace('\r\n', '\n') if '\r\n' in s else s

def unquoted(s: str) -> str:
    'Ignore surrounding quotes in a string.'

    if s.startswith('"') and s.endswith('"'):
        return s[1:-1]
    if s.startswith('\'') and s.endswith('\''):
        return s[1:-1]
    if s.startswith('`') and s.endswith('`'):
        return s[1:-1]
    if s.startswith('”') and s.endswith('“'):
        return s[1:-1]
    if s.startswith('“') and s.endswith('”'):
        return s[1:-1]
    return s

dequote = unquoted
dequoted = unquoted

def until(x: Union[str, Iterable], what: Any) -> Union[str, Iterable]:
    'End strings/sequences with a substring/value\'s appearance.'
    return (struntil if isinstance(x, str) else itemsuntil)(x, what)

def untillast(x: Union[str, Iterable], what: Any) -> Union[str, Iterable]:
    'End strings/sequences with a substring/value\'s last appearance.'
    return (struntillast if isinstance(x, str) else itemsuntillast)(x, what)

untilfinal = untillast


def wait(seconds: Union[int, float], result: Any) -> Any:
    'Wait the given number of seconds, before returning its latter arg.'

    t = (int, float)
    if (not isinstance(seconds, t)) and isinstance(result, t):
        seconds, result = result, seconds
    sleep(seconds)
    return result

delay = wait

def wat(*args) -> None:
    'What Are These (wat) shows help/doc messages for funcs given to it.'

    from pydoc import doc

    c = 0
    w = stderr

    for e in args:
        if not callable(e):
            continue

        if c > 0:
            print(file=w)

        print(f'\x1b[48;5;253m\x1b[38;5;26m{e.__name__:80}\x1b[0m', file=w)
        doc(e, output=w)
        c += 1

    return Skip()

def wit(*args) -> None:
    'What Is This (wit) shows help/doc messages for funcs given to it.'
    return wat(*args)

def zoom(x: Any, *keys_indices) -> Any:
    for k in keys_indices:
        # allow int-indexing dicts the same way lists/tuples can be
        if isinstance(x, dict) and isinstance(k, int):
            l = len(x)
            # negative indices count backward from the end
            if k < 0:
                k += l
            if k < 0 or k >= l:
                x = None
                continue
            for i, e in enumerate(x.values()):
                if i == k:
                    x = e
                    break
            continue

        # regular key/index access for dicts/lists/tuples
        x = x[k]

    return x


# args is the `proper` list of arguments given to the script
args = argv[1:]
run_mode = ''
trace_exceptions = False
profile_run = False

if len(args) == 0:
    # show help message when given no arguments
    print(info.strip(), file=stderr)
    exit(0)

trace_opts = (
    '-t', '--t', '-trace', '--trace', '-traceback', '--traceback',
)
profile_opts = ('-p', '--p', '-prof', '--prof', '-profile', '--profile')

# handle all other leading options; the explicit help options are
# handled earlier in the script
while len(args) > 0:
    if args[0] in trace_opts:
        trace_exceptions = True
        args = args[1:]
        continue

    if args[0] in profile_opts:
        profile_run = True
        args = args[1:]
        continue

    s = opts2modes.get(args[0], '')
    if not s:
        break

    run_mode = s
    args = args[1:]

inputs = []
expression = ''
if len(args) > 0:
    expression = args[0]
    inputs = args[1:]

if not run_mode:
    run_mode = 'each-line'

if not expression and not (run_mode in ('json-lines', 'each-line')):
    # show help message when given no expression
    print(info.strip(), file=stderr)
    exit(0)

glo = globals()
for e in (physics, symbols, units):
    for k, v in e.__dict__.items():
        if k not in glo:
            glo[k] = v

exec = disabled_exec

try:
    # compile the expression to
    # speed it up, since they're all (re)run
    # for each line from standard input; also, handle a single-dot as
    # an identity expression, using the current line as is
    if expression in ('', '.'):
        expression = {
            'all-lines': 'lines',
            'all-bytes': 'data',
            'each-block': 'block',
            'each-line': 'line',
            'json-lines': 'data',
            'no-input': 'info.strip()',
            'whole-strings': 'value',
        }[run_mode]
    expression = compile_py(expression, expression, 'eval')

    # `comprehension` expressions seem to ignore local variables: even
    # lambda-based workarounds fail
    i = 0
    c = 1
    nr = 1
    _ = None

    fn = {
        'each-line': stop_normal,
        'each-block': stop_normal,
        'all-lines': stop_normal,
        'all-bytes': stop_normal,
        'json-lines': stop_json,
        'no-input': stop_normal,
        'whole-strings': stop_normal,
    }[run_mode]
    glo['halt'] = fn
    glo['stop'] = fn

    fn = {
        'each-line': main_each_line,
        'each-block': main_each_block,
        'all-lines': main_all_lines,
        'all-bytes': main_all_bytes,
        'json-lines': main_json_lines,
        'no-input': main_no_input,
        'whole-strings': main_whole_strings,
    }[run_mode]

    if fn is None:
        raise Exception(f'internal error: invalid run-mode {run_mode}')

    if profile_run:
        from cProfile import Profile
        # using a profiler in a `with` context adds many irrelevant
        # entries to its output
        prof = Profile()
        prof.enable()
        fn(stdout, stdin, expression, inputs)
        prof.disable()
        prof.print_stats()
    else:
        fn(stdout, stdin, expression, inputs)
except BrokenPipeError:
    # quit quietly, instead of showing a confusing error message
    stderr.close()
except KeyboardInterrupt:
    exit(2)
except Exception as e:
    if trace_exceptions:
        raise e
    s = str(e)
    s = s if s else '<generic exception>'
    print(f'\x1b[31m{s}\x1b[0m',
          file=stderr)
    exit(1)
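
The main section compiles the expression once, then hands it to a per-mode runner such as `main_each_line`, whose body is not part of this section. As a rough illustration only, the default each-line mode described in the help text could be sketched as below. The name `transform_lines` and its exact emit rules are assumptions made for this example, not the script's actual implementation.

```python
from json import dumps
from typing import Iterable, Iterator


def transform_lines(expr: str, lines: Iterable[str]) -> Iterator[str]:
    'Hypothetical helper: evaluate an expression once per input line.'

    # compile only once, since the same expression is re-run for every line
    code = compile(expr, '<expression>', 'eval')
    env: dict = {}

    for i, raw in enumerate(lines):
        # lines are stripped of trailing end-of-line sequences
        line = raw.rstrip('\r\n')
        env.update(line=line, l=line, i=i)
        res = eval(code, env)

        if res is None:
            # None results filter out the current line
            continue
        if isinstance(res, str):
            yield res
        elif isinstance(res, dict):
            # dictionaries are emitted as a single JSON line
            yield dumps(res)
        elif isinstance(res, Iterable):
            # other iterables `amplify` into one output line per item
            for e in res:
                yield str(e)
        else:
            yield str(res)
```

For instance, `list(transform_lines('line.split()', ['a b']))` gives `['a', 'b']`, since the list result is spread over separate output lines.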