File: tl.py
#!/usr/bin/python3

# The MIT License (MIT)
#
# Copyright © 2024 pacman64
#
# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the “Software”), to deal
# in the Software without restriction, including without limitation the rights
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
# copies of the Software, and to permit persons to whom the Software is
# furnished to do so, subject to the following conditions:
#
# The above copyright notice and this permission notice shall be included in
# all copies or substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
# SOFTWARE.


info = '''
tl [options...] [python expression] [filepaths/URIs...]


Transform Lines runs a Python expression on each line of plain-text data:
each expression given emits its result as its own line. Each input line is
available to the expression as either `line`, or `l`. Lines are always
stripped of any trailing end-of-line bytes/sequences.

When the expression results in non-string iterable values, a sort of input
`amplification` happens for the current input-line, where each item from
the result is emitted on its own output line. Dictionaries emit their data
as a single JSON line.

When a formula's result is the None value, it emits no output line, which
filters-out the current line, the same way empty-iterable results do.

When in `all` mode, all input lines are read first into a list of strings,
whose items are all stripped of any end-of-line sequences, and kept in the
`lines` global variable: the expression given is then run only once.

Similarly, if the argument before the expression is a single equals sign
(a `=`, but without the quotes), no data are read/loaded: the expression is
then run only once, effectively acting as a `pure` plain-text generator.

Current-input names, depending on mode:

    names               mode                   evaluation

    l, line             each-line (default)    for each input line
    lines               all-lines              once, after last input line
    b, block, p, par    block/paragraph        for each paragraph of lines
    v, value            jsonl                  for each input line
    (no name)           no-input               once, without an input value

Modes `each-line` (the default) and `block/paragraph` also define `i` as an
integer which starts as 0, and which is incremented after each evaluation.

Options, where leading double-dashes are also allowed, except for alias `=`:

    -a             read all lines at once into a string-list called `lines`
    -all           same as -a
    -lines         same as -a

    -b             read uninterrupted blocks/groups of lines as paragraphs
    -blocks        same as -b
    -g             same as -b
    -groups        same as -b
    -p             same as -b
    -par           same as -b
    -para          same as -b
    -paragraphs    same as -b

    -h             show this help message
    -help          same as -h

    -jsonl         transform JSON Lines into proper JSON

    -nil           don't read any input, and run the expression only once
    -no-input      same as -nil
    -noinput       same as -nil
    -none          same as -nil
    -null          same as -nil
    -null-input    same as -nil
    -nullinput     same as -nil
    =              same as -nil

    -p             show a performance/time-profile of the full `task` run
    -prof          same as -p
    -profile       same as -p

    -s             read each input as a whole string
    -str           same as -s
    -string        same as -s
    -w             same as -s
    -whole         same as -s

    -t             show a full traceback of this script for exceptions
    -trace         same as -t
    -traceback     same as -t


Extra Functions

    blue(s)            color strings blue, using surrounding ANSI-style sequences
    gray(s)            color strings gray, using surrounding ANSI-style sequences
    green(s)           color strings green, using surrounding ANSI-style sequences
    highlight(s)       highlight strings, using surrounding ANSI-style sequences
    hilite(s)          same as func highlight
    orange(s)          color strings orange, using surrounding ANSI-style sequences
    purple(s)          color strings purple, using surrounding ANSI-style sequences
    red(s)             color strings red, using surrounding ANSI-style sequences

    realign(x, gap=2)  pad items across lines, so that all "columns" align

    after(x, y)        ignore items until the one given; for strings and sequences
    afterfinal(x, y)   backward counterpart of func after
    afterlast(x, y)    same as func afterfinal
    arrayish(x)        check if value is a list, a tuple, or a generator
    basename(s)        get the final/file part of a pathname
    before(x, y)       ignore items since the one given; for strings and sequences
    beforefinal(x, y)  backward counterpart of func before
    beforelast(x, y)   same as func beforefinal
    chunk(x, size)     split/resequence items into chunks of the length given
    chunked(x, size)   same as func chunk
    compose(*args)     make a func which chain-calls all funcs given
    composed(*args)    same as func compose
    cond(*args)        expression-friendly fully-evaluated if-else chain
    debase64(s)        decode base64 strings, including data-URIs
    dedup(x)           ignore later (re)occurrences of values in a sequence
    dejson(x, f=None)  safe parse JSON from strings
    denan(x, y)        turn floating-point NaN values into the fallback given
    denil(*args)       return the first non-null/none value among those given
    denone(*args)      same as func denil
    denull(*args)      same as func denil
    dirname(s)         get the folder/directory/parent part of a pathname
    dive(x, f)         transform value in depth-first-recursive fashion
    divebin(x, y, f)   binary (2-input) version of recursive-transform func dive
    drop(x, *what)     ignore keys or substrings; for strings, dicts, dict-lists
    dropped(x, *v)     same as func drop
    each(x, f)         generalization of built-in func map
    endict(x)          turn non-dictionary values into dicts with string keys
    enfloat(x, f=nan)  turn values into floats, offering a fallback on failure
    enint(x, f=None)   turn values into ints, offering a fallback on failure
    enlist(x)          turn non-list values into lists
    entuple(x)         turn non-tuple values into tuples
    ext(s)             return the file-extension part of a pathname, if available
    fields(s)          split fields AWK-style from the string given
    filtered(x, f)     same as func keep
    flat(*args)        flatten everything into an unnested sequence
    fromto(x, y, ?f)   sequence integers, end-value included
    group(x, ?by)      group values into dicts of lists; optional transform func
    grouped(x, ?by)    same as func group
    harden(f, v)       make funcs which return values instead of exceptions
    hardened(f, v)     same as func harden
    countif(x, f)      count how many values make the func given true-like
    idiota(x, ?f)      dict-counterpart of func iota
    ints(x, y, ?f)     make sequences of increasing integers, which include the end
    iota(x, ?f)        make an integer sequence from 1 up to the number given
    join(x, y)         join values into a string; make a dict from keys and values
    json0(x)           turn a value into its smallest JSON-string representation
    json2(x)           turn a value into a 2-space-indented multi-line JSON string
    jsonl(x)           turn a value into a sequence of single-line (JSONL) strings
    keep(x, pred)      generalization of built-in func filter
    kept(x, pred)      same as func keep
    links(x)           auto-detect all hyperlink-like (HTTP/HTTPS) substrings
    mapped(x, f)       same as func each
    number(x)          try to parse as an int, on failure try to parse as a float
    numbers(x)         auto-detect all numbers in the value given
    numstats(x)        calculate various `single-pass` numeric stats
    once(x, y=None)    avoid returning the same value more than once; stateful func
    pick(x, *what)     keep only the keys given; works on dicts, or dict-sequences
    picked(x, *what)   same as func pick
    plain(s)           ignore ANSI-style sequences in strings
    quoted(s, q='"')   surround a string with the (optional) quoting-symbol given
    recover(*args)     recover from exceptions with a fallback value
    reject(x, pred)    generalization of built-in func filter, with opposite logic
    since(x, y)        ignore items before the one given; for strings and sequences
    sincefinal(x, y)   backward counterpart of func since
    sincelast(x, y)    same as func sincefinal
    split(x, y)        split string by separator; split sequence into several ones
    squeeze(s)         strip/trim a string, squishing inner runs of spaces
    stround(x, d=6)    format numbers into decimal-number strings
    tally(x, ?by)      count/tally values, using an optional transformation func
    tallied(x, ?by)    same as func tally
    trap(x, f=None)    try running a func, handing exceptions to a fallback func
    trycall(*args)     same as func recover
    unique(x)          same as func dedup
    uniqued(x)         same as func dedup
    unjson(x, f=None)  same as func dejson
    unquoted(s)        ignore surrounding quotes, if present
    until(x, y)        ignore items after the one given; for strings and sequences
    untilfinal(x, y)   backward counterpart of func until
    untillast(x, y)    same as func untilfinal
    wait(seconds, x)   wait the given number of seconds, before returning a value
    wat(*args)         What Are These (wat) shows help/doc messages for funcs


Examples

    # numbers from 0 to 5, each on its own output line; no input is read/used
    tl = 'range(6)'

    # all powers up to the 4th, using each input line auto-parsed into a `float`
    tl = 'range(1, 6)' | tl '(float(l)**p for p in range(1, 4+1))'

    # separate input lines with an empty line between each; global var `empty`
    # can be used to avoid bothering with nested shell-quoting
    tl = 'range(6)' | tl '["", l] if i > 0 else l'

    # keep only the last 2 lines from the input
    tl = 'range(1, 6)' | tl -all 'lines[-2:]'

    # join input lines into tab-separated lines of up to 3 items each; global
    # var named `tab` can be used to avoid bothering with nested shell-quoting
    tl = 'range(1, 8)' | tl -all '("\\t".join(c) for c in chunk(lines, 3))'

    # ignore all lines before the first one with just a '5' in it
    tl = 'range(8)' | tl -all 'since(lines, "5")'

    # ignore errors/exceptions, in favor of the original lines/values
    tl = '("abc", "123")' | tl 'safe(lambda: 2 * float(line), line)'

    # ignore errors/exceptions, calling a fallback func with the exception
    tl = '("abc", "123")' | tl 'safe(lambda: 2 * float(line), lambda e: str(e))'

    # filtering lines out via None values
    head -c 1024 /dev/urandom | strings | tl 'l if len(l) < 20 else None'

    # boolean-valued results are concise ways to filter lines out
    head -c 1024 /dev/urandom | strings | tl 'len(l) < 20'

    # function/callable results are automatically called on the current line
    head -c 1024 /dev/urandom | strings | tl len
'''


from sys import argv, exit, stderr, stdin, stdout


if __name__ != '__main__':
    print('don\'t import this script, run it directly instead', file=stderr)
    exit(1)

# no args or a leading help-option arg means show the help message and quit
if len(argv) < 2 or argv[1] in ('-h', '--h', '-help', '--help'):
    print(info.strip(), file=stderr)
    exit(0)


from io import StringIO, TextIOWrapper

from typing import \
    AbstractSet, Annotated, Any, AnyStr, \
    AsyncContextManager, AsyncGenerator, AsyncIterable, AsyncIterator, \
    Awaitable, BinaryIO, ByteString, Callable, cast, \
    ClassVar, Collection, Container, \
    ContextManager, Coroutine, Deque, Dict, Final, \
    final, ForwardRef, FrozenSet, Generator, Generic, get_args, get_origin, \
    get_type_hints, Hashable, IO, ItemsView, \
    Iterable, Iterator, KeysView, List, Literal, Mapping, \
    MappingView, Match, MutableMapping, MutableSequence, MutableSet, \
    NamedTuple, NewType, no_type_check, no_type_check_decorator, \
    NoReturn, Optional, overload, \
    Protocol, Reversible, \
    runtime_checkable, Sequence, Set, Sized, SupportsAbs, \
    SupportsBytes, SupportsComplex, SupportsFloat, SupportsIndex, \
    SupportsInt, SupportsRound, Text, TextIO, Tuple, Type, \
    TypedDict, TypeVar, \
    TYPE_CHECKING, Union, ValuesView
try:
    from typing import \
        assert_never, assert_type, clear_overloads, Concatenate, \
        dataclass_transform, get_overloads, is_typeddict, LiteralString, \
        Never, NotRequired, ParamSpec, ParamSpecArgs, ParamSpecKwargs, \
        Required, reveal_type, Self, TypeAlias, TypeGuard, TypeVarTuple, \
        Unpack
    from typing import \
        AwaitableGenerator, override, TypeAliasType, type_check_only
except Exception:
    pass


def conforms(x: Any) -> bool:
    '''
    Check if a value is JSON-compatible, which includes checking values
    recursively, in case of composite/nestable values.
    '''

    if x is None or isinstance(x, (bool, int, str)):
        return True
    if isinstance(x, float):
        return not (isnan(x) or isinf(x))
    if isinstance(x, (list, tuple)):
        return all(conforms(e) for e in x)
    if isinstance(x, dict):
        return all(conforms(k) and conforms(v) for k, v in x.items())
    return False


def seems_url(s: str) -> bool:
    protocols = ('https://', 'http://', 'file://', 'ftp://', 'data:')
    return any(s.startswith(p) for p in protocols)


def disabled_exec(*args, **kwargs) -> None:
    _ = args
    _ = kwargs
    raise Exception('built-in func `exec` is disabled')


def fix_value(x: Any, default: Any) -> Any:
    'Adapt a value so it can be output.'

    # true shows the current line as the current output; presumably
    # this is the result of calling a `condition-like` expression
    if x is True:
        return default

    # null and false show no output for the current input line
    if x is False:
        return None

    if x is type:
        return type(default).__name__

    # if expression results in a func, auto-call it with the original data
    if callable(x) and not isinstance(x, Iterable):
        c = required_arg_count(x)
        if c == 1:
            x = x(default)
        else:
            m = f'func auto-call only works with 1-arg funcs (func wanted {c})'
            raise Exception(m)

    if x is None or isinstance(x, (bool, int, float, str)):
        return x

    rec = fix_value

    if isinstance(x, dict):
        return {
            rec(k, default): rec(v, default) for k, v in x.items() if not
            (isinstance(k, Skip) or isinstance(v, Skip))
        }
    if isinstance(x, Iterable):
        return tuple(rec(e, default) for e in x if not isinstance(e, Skip))

    if isinstance(x, Dottable):
        return rec(x.__dict__, default)
    if isinstance(x, DotCallable):
        return rec(x.value, default)

    if isinstance(x, Exception):
        raise x

    return None if isinstance(x, Skip) else str(x)


def show_value(w, x: Any) -> None:
    'Helper func used by func show_result.'

    # null shows no output for the current input line
    if x is None or isinstance(x, Skip):
        return

    if isinstance(x, dict):
        dump(x, w, separators=(', ', ': '), allow_nan=False, indent=None)
        w.write('\n')
        w.flush()
    elif isinstance(x, (bytes, bytearray)):
        w.write(x)
        w.flush()
    elif isinstance(x, Iterable) and not isinstance(x, str):
        dump(x, w, separators=(', ', ': '), allow_nan=False, indent=None)
        w.write('\n')
        w.flush()
    elif isinstance(x, DotCallable):
        print(x.value, file=w, flush=True)
    else:
        print(x, file=w, flush=True)


def show_result(w, x: Any) -> None:
    if isinstance(x, (dict, str)):
        show_value(w, x)
    elif isinstance(x, Iterable):
        for e in x:
            if isinstance(e, Exception):
                raise e
            show_value(w, e)
    else:
        show_value(w, x)


def make_open_utf8(open: Callable) -> Callable:
    'Restrict the file-open func to a read-only utf-8 file-open func.'
    def open_utf8_readonly(name: str):
        'A UTF-8 read-only file-open func overriding the built-in open func.'
        return open(name, encoding='utf-8')
    return open_utf8_readonly

open_utf8 = make_open_utf8(open)
open = open_utf8


def loop_lines_inputs(r, inputs: List[str], doing: Callable) -> None:
    '''
    Act on multiple named inputs line-by-line, via the func given; when
    not given any named inputs, the default reader given is used instead.
    '''

    main_input: List[str] = []
    got_main_input = False
    dashes = inputs.count('-')

    if any(seems_url(e) for e in inputs):
        from urllib.request import urlopen

    def _adapt_lines(src) -> Iterable[str]:
        for j, line in enumerate(src):
            if j == 0:
                # a UTF-8 byte-order mark decodes to the single code point
                # U+FEFF, so that's what needs stripping from decoded text
                line = line.lstrip('\ufeff')
            # rstrip takes a set of chars, so this handles both LF and CRLF
            yield line.rstrip('\r\n')

    def _hold_adapt_lines(src) -> Iterable[str]:
        for e in _adapt_lines(src):
            main_input.append(e)
            yield e

    for path in inputs:
        if path == '-':
            if dashes == 1:
                doing(_adapt_lines(r))
                continue

            if not got_main_input:
                doing(_hold_adapt_lines(r))
                got_main_input = True
            else:
                doing(_adapt_lines(main_input))
            continue

        if seems_url(path):
            with urlopen(path) as inp:
                with TextIOWrapper(inp, encoding='utf-8') as txt:
                    doing(_adapt_lines(txt))
            continue

        with open_utf8(path) as inp:
            doing(_adapt_lines(inp))

    if len(inputs) == 0:
        doing(_adapt_lines(r))


def loop_whole_inputs(r, inputs: List[str], doing: Callable) -> None:
    '''
    Act on multiple named inputs, read as whole strings, via the func given;
    when not given any named inputs, the default reader given is used instead.
    '''

    # main_input holds the whole stdin text, in case `-` is given many times
    main_input: str = ''
    got_main_input = False
    dashes = inputs.count('-')

    if any(seems_url(e) for e in inputs):
        from urllib.request import urlopen

    for path in inputs:
        if path == '-':
            if dashes == 1:
                doing(r.read())
                continue

            if not got_main_input:
                main_input = r.read()
                got_main_input = True
            doing(main_input)
            continue

        if seems_url(path):
            with urlopen(path) as inp:
                with TextIOWrapper(inp, encoding='utf-8') as txt:
                    doing(txt.read())
            continue

        with open_utf8(path) as inp:
            doing(inp.read())

    if len(inputs) == 0:
        doing(r.read())


def main_whole_strings(out, r, expression, inputs) -> None:
    def _each_string(out, src, expression: Any) -> None:
        # `comprehension` expressions seem to ignore local variables: even
        # lambda-based workarounds fail
        global s, t, text, v, value, w, whole, _

        s = t = text = v = value = w = whole = src
        res = eval(expression)
        res = fix_value(res, src)
        show_result(out, res)
        _ = res

    loop_whole_inputs(r, inputs, lambda s: _each_string(out, s, expression))


def main_each_line(w, r, expression, inputs) -> None:
    def _each_line(w, src, expression: Any) -> None:
        # `comprehension` expressions seem to ignore local variables: even
        # lambda-based workarounds fail
        global i, nr, previous, prev, line, l, _

        previous = ''
        prev = previous

        for line in src:
            l = line
            res = eval(expression)
            res = fix_value(res, line)
            show_result(w, res)
            i += 1
            nr += 1
            previous = line
            prev = previous
            _ = res

    loop_lines_inputs(r, inputs, lambda r: _each_line(w, r, expression))


def main_each_block(w, r, expression, inputs) -> None:
    def _each_block(w, r, expression) -> None:
        # `comprehension` expressions seem to ignore local variables: even
        # lambda-based workarounds fail
        global i, nr
        global previous, prev, lines, block, par, para, paragraph, _

        for item in paragraphize(r):
            lines = block = par = para = paragraph = item
            res = eval(expression)
            if isinstance(res, Skip):
                # name `data` isn't set in this mode: the current block is
                # what becomes the previous value, as in the non-skip case
                previous = lines
                prev = previous
                i += 1
                nr += 1
                continue

            res = fix_value(res, lines)
            show_result(w, res)
            i += 1
            nr += 1
            prev = previous = lines
            _ = res

    loop_lines_inputs(r, inputs, lambda r: _each_block(w, r, expression))


def main_all_lines(w, r, expression, inputs) -> None:
    # `comprehension` expressions seem to ignore local variables: even
    # lambda-based workarounds fail
    global line, lines, data, values, d, l, v, dat, val

    def _all_lines(w, r, expression) -> None:
        # `comprehension` expressions seem to ignore local variables: even
        # lambda-based workarounds fail
        global lines, line, l

        for e in r:
            line = l = e
            lines.append(line)

    lines = []
    line = l = ''
    loop_lines_inputs(r, inputs, lambda r: _all_lines(w, r, expression))
    data = values = d = v = dat = val = lines
    res = eval(expression)
    res = fix_value(res, lines)
    show_result(w, res)


def main_all_bytes(w, r, expression, inputs) -> None:
    # `comprehension` expressions seem to ignore local variables: even
    # lambda-based workarounds fail
    global data, values, d, v, dat, val
    data = values = d = v = dat = val = r.buffer.read()
    res = eval(expression)
    res = fix_value(res, data)
    show_result(w, res)


def main_no_input(w, r, expression, inputs) -> None:
    res = eval(expression)
    fix = lambda x: fix_value(x, None)
    f = str if res is None or isinstance(res, bool) else fix
    res = f(res)
    show_result(w, res)


def main_json_lines(w, r, expression, inputs) -> None:
    def _jsonl2json(w, src, expression: Any) -> None:
        # `comprehension` expressions seem to ignore local variables: even
        # lambda-based workarounds fail
        global i, nr
        global line, l, data, d, value, v, dat, val, prev, previous, _

        previous = None
        prev = previous

        for line in src:
            if emptyish_re.match(line) or commented_re.match(line):
                continue
            l = line

            data = value = d = v = dat = val = loads(line)
            res = eval(expression)

            if isinstance(res, Skip):
                previous = data
                prev = previous
                i += 1
                nr += 1
                continue
            res = fix_value(res, data)

            if callable(res):
                res = res(data)
            if not conforms(res):
                res = conform(res)
            dump(res, w)
            _ = res
            w.write('\n')

            previous = data
            prev = previous
            i += 1
            nr += 1

    loop_lines_inputs(r, inputs, lambda r: _jsonl2json(w, r, expression))


# opts2modes simplifies option-handling in func main
opts2modes = {
    '=': 'no-input',
    '-nil': 'no-input',
    '-no-input': 'no-input',
    '-noinput': 'no-input',
    '-none': 'no-input',
    '-None': 'no-input',
    '-null': 'no-input',
    '-null-input': 'no-input',
    '-nullinput': 'no-input',
    '--n': 'no-input',
    '--nil': 'no-input',
    '--no-input': 'no-input',
    '--noinput': 'no-input',
    '--none': 'no-input',
    '--None': 'no-input',
    '--null': 'no-input',
    '--null-input': 'no-input',
    '--nullinput': 'no-input',
    '-a': 'all-lines',
    '-all': 'all-lines',
    '-lines': 'all-lines',
    '--a': 'all-lines',
    '--all': 'all-lines',
    '--lines': 'all-lines',
    '-b': 'each-block',
    '-blocks': 'each-block',
    '-g': 'each-block',
    '-groups': 'each-block',
    '-p': 'each-block',
    '-par': 'each-block',
    '-para': 'each-block',
    '-paragraphs': 'each-block',
    '--b': 'each-block',
    '--blocks': 'each-block',
    '--g': 'each-block',
    '--groups': 'each-block',
    '--p': 'each-block',
    '--par': 'each-block',
    '--para': 'each-block',
    '--paragraphs': 'each-block',
    '-bytes': 'bytes',
    '--bytes': 'bytes',
    '-jl': 'json-lines',
    '-jsonl': 'json-lines',
    '-jsonlines': 'json-lines',
    '-json-lines': 'json-lines',
    '--jl': 'json-lines',
    '--jsonl': 'json-lines',
    '--jsonlines': 'json-lines',
    '--json-lines': 'json-lines',
    '-s': 'whole-strings',
    '-str': 'whole-strings',
    '-string': 'whole-strings',
    '--s': 'whole-strings',
    '--str': 'whole-strings',
    '--string': 'whole-strings',
    '-w': 'whole-strings',
    '-whole': 'whole-strings',
    '--w': 'whole-strings',
    '--whole': 'whole-strings',
}


def blue(s: Any) -> str:
    'Blue-style a plain string via ANSI-style sequences.'
    return f'\x1b[38;5;26m{s}\x1b[0m'

def blueback(s: Any) -> str:
    'Blue-background-style a plain string via ANSI-style sequences.'
    return f'\x1b[48;5;26m\x1b[38;5;255m{s}\x1b[0m'

bluebg = blueback

def bold(s: Any) -> str:
    'Bold-style a plain string via ANSI-style sequences.'
    return f'\x1b[1m{s}\x1b[0m'

def gbm(s: str, good: Any = False, bad: Any = False, meh: Any = False) -> str:
    '''
    Good, Bad, Meh ANSI-styles a plain string via ANSI-style sequences,
    according to 1..3 conditions given as boolean(ish) values: these are
    checked in order, so the first truish one wins.
    '''

    if good:
        return green(s)
    if bad:
        return red(s)
    if meh:
        return gray(s)
    return s

def gray(s: Any) -> str:
    'Gray-style a plain string via ANSI-style sequences.'
    return f'\x1b[38;5;248m{s}\x1b[0m'

def grayback(s: Any) -> str:
    'Gray-background-style a plain string via ANSI-style sequences.'
    return f'\x1b[48;5;253m{s}\x1b[0m'

graybg = grayback

def green(s: Any) -> str:
    'Green-style a plain string via ANSI-style sequences.'
    return f'\x1b[38;5;29m{s}\x1b[0m'

def greenback(s: Any) -> str:
    'Green-background-style a plain string via ANSI-style sequences.'
    return f'\x1b[48;5;29m\x1b[38;5;255m{s}\x1b[0m'

greenbg = greenback

def highlight(s: Any) -> str:
    'Highlight/reverse-style a plain string via ANSI-style sequences.'
    return f'\x1b[7m{s}\x1b[0m'

hilite = highlight

def magenta(s: Any) -> str:
    'Magenta-style a plain string via ANSI-style sequences.'
    return f'\x1b[38;5;165m{s}\x1b[0m'

def magentaback(s: Any) -> str:
    'Magenta-background-style a plain string via ANSI-style sequences.'
    return f'\x1b[48;5;165m\x1b[38;5;255m{s}\x1b[0m'

magback = magentaback
magbg = magentaback
magentabg = magentaback

def orange(s: Any) -> str:
    'Orange-style a plain string via ANSI-style sequences.'
    return f'\x1b[38;5;166m{s}\x1b[0m'

def orangeback(s: Any) -> str:
    'Orange-background-style a plain string via ANSI-style sequences.'
    return f'\x1b[48;5;166m\x1b[38;5;255m{s}\x1b[0m'

orangebg = orangeback
orback = orangeback
orbg = orangeback

def purple(s: Any) -> str:
    'Purple-style a plain string via ANSI-style sequences.'
    return f'\x1b[38;5;99m{s}\x1b[0m'

def purpleback(s: Any) -> str:
    'Purple-background-style a plain string via ANSI-style sequences.'
    return f'\x1b[48;5;99m\x1b[38;5;255m{s}\x1b[0m'

purback = purpleback
purbg = purpleback
purplebg = purpleback

def red(s: Any) -> str:
    'Red-style a plain string via ANSI-style sequences.'
    return f'\x1b[38;5;1m{s}\x1b[0m'

def redback(s: Any) -> str:
    'Red-background-style a plain string via ANSI-style sequences.'
    return f'\x1b[48;5;1m\x1b[38;5;255m{s}\x1b[0m'

redbg = redback

def underline(s: Any) -> str:
    'Underline-style a plain string via ANSI-style sequences.'
    return f'\x1b[4m{s}\x1b[0m'


def realign(lines: List[str], gap: int = 2) -> Iterable:
    '''
    Pad lines so that their items align across/vertically: extra padding
    is put between such `columns`, using 2 spaces by default.
    '''

    widths: List[int] = []
    for l in lines:
        items = awk_sep_re.split(l.strip())
        while len(widths) < len(items):
            widths.append(0)
        for i, s in enumerate(items):
            widths[i] = max(widths[i], len(s))

    sb = StringIO()
    gap = max(gap, 0)

    for l in lines:
        sb.truncate(0)
        sb.seek(0)

        padding = 0
        items = awk_sep_re.split(l.strip())
        for s, w in zip(items, widths):
            sb.write(padding * ' ')
            sb.write(s)
            padding = max(w - len(s), 0) + gap

        yield sb.getvalue()


def stop_normal(x: Any, exit_code: int = 0) -> NoReturn:
    show_result(stdout, fix_value(x, None))
    exit(exit_code)


def stop_json(x: Any, exit_code: int = 0) -> NoReturn:
    dump(x, stdout)
    stdout.write('\n')
    stdout.flush()
    exit(exit_code)


from base64 import \
    standard_b64encode, standard_b64decode, \
    standard_b64encode as base64bytes, standard_b64decode as debase64bytes

from collections import \
    ChainMap, Counter, defaultdict, deque, namedtuple, OrderedDict, \
    UserDict, UserList, UserString

from copy import copy, deepcopy

from datetime import \
    MAXYEAR, MINYEAR, date, datetime, time, timedelta, timezone, tzinfo
try:
    from datetime import now, UTC
except Exception:
    now = lambda: datetime(2000, 1, 1).now()

from decimal import Decimal, getcontext

from difflib import \
    context_diff, diff_bytes, Differ, get_close_matches, HtmlDiff, \
    IS_CHARACTER_JUNK, IS_LINE_JUNK, ndiff, restore, SequenceMatcher, \
    unified_diff

from fractions import Fraction

import functools
from functools import \
    cache, cached_property, cmp_to_key, get_cache_token, lru_cache, \
    namedtuple, partial, partialmethod, recursive_repr, reduce, \
    singledispatch, singledispatchmethod, total_ordering, update_wrapper, \
    wraps

from glob import glob, iglob

try:
    from graphlib import CycleError, TopologicalSorter
except Exception:
    pass

from hashlib import \
    file_digest, md5, pbkdf2_hmac, scrypt, sha1, sha224, sha256, sha384, \
    sha512

from inspect import getfullargspec, getsource

import itertools
from itertools import \
    accumulate, chain, combinations, combinations_with_replacement, \
    compress, count, cycle, dropwhile, filterfalse, groupby, islice, \
    permutations, product, repeat, starmap, takewhile, tee, zip_longest
try:
    from itertools import pairwise
    from itertools import batched
except Exception:
    pass

from json import dump, dumps, loads

import math
Math = math
from math import \
    acos, acosh, asin, asinh, atan, atan2, atanh, ceil, comb, \
    copysign, cos, cosh, degrees, dist, e, erf, erfc, exp, expm1, \
    fabs, factorial, floor, fmod, frexp, fsum, gamma, gcd, hypot, inf, \
    isclose, isfinite, isinf, isnan, isqrt, lcm, ldexp, lgamma, log, \
    log10, log1p, log2, modf, nan, nextafter, perm, pi, pow, prod, \
    radians, remainder, sin, sinh, sqrt, tan, tanh, tau, trunc, ulp
try:
    from math import cbrt, exp2
except Exception:
    pass

power = pow

import operator

from pathlib import Path

from pprint import \
    isreadable, isrecursive, pformat, pp, pprint, PrettyPrinter, saferepr

from random import \
    betavariate, choice, choices, expovariate, gammavariate, gauss, \
    getrandbits, getstate, lognormvariate, normalvariate, paretovariate, \
    randbytes, randint, random, randrange, sample, seed, setstate, \
    shuffle, triangular, uniform, vonmisesvariate, weibullvariate

compile_py = compile  # keep built-in func compile for later
from re import compile as compile_uncached, Pattern, IGNORECASE

import statistics
from statistics import \
    bisect_left, bisect_right, fmean, \
    geometric_mean, harmonic_mean, mean, median, \
    median_grouped, median_high, median_low, mode, multimode, pstdev, \
    pvariance, quantiles, stdev, variance
try:
    from statistics import \
        correlation, covariance, linear_regression, mul
except Exception:
    pass

import string
from string import \
    Formatter, Template, ascii_letters, ascii_lowercase, ascii_uppercase, \
    capwords, digits, hexdigits, octdigits, printable, punctuation, \
    whitespace

alphabet = ascii_letters
letters = ascii_letters
lowercase = ascii_lowercase
uppercase = ascii_uppercase

from textwrap import dedent, fill, indent, shorten, wrap

from time import \
    altzone, asctime, \
    ctime, daylight, get_clock_info, \
    gmtime, localtime, mktime, monotonic, monotonic_ns, perf_counter, \
    perf_counter_ns, process_time, process_time_ns, \
    sleep, strftime, strptime, struct_time, thread_time, thread_time_ns, \
    time, time_ns, timezone, tzname
try:
    from time import \
        clock_getres, clock_gettime, clock_gettime_ns, clock_settime, \
        clock_settime_ns, pthread_getcpuclockid, tzset
except Exception:
    pass

from unicodedata import \
    bidirectional, category, combining, decimal, decomposition, digit, \
    east_asian_width, is_normalized, lookup, mirrored, name, normalize, \
    numeric

from urllib.parse import \
    parse_qs, parse_qsl, quote, quote_from_bytes, quote_plus, unquote, \
    unquote_plus, unquote_to_bytes, unwrap, urldefrag, urlencode, urljoin, \
    urlparse, urlsplit, urlunparse, urlunsplit


class Skip:
    'Custom type which some funcs type-check to skip values in containers.'

    def __init__(self, *args) -> None:
        pass


# skip is a ready-to-use value which some funcs filter against: this way
# filtering values becomes a special case of transforming values
skip = Skip()

# re_cache is used by custom func compile to cache previously-compiled
# regular-expressions, which makes them quicker to (re)use in formulas
re_cache: Dict[str, Pattern] = {}

# ire_cache is like re_cache, except it's for case-insensitive regexes
ire_cache: Dict[str, Pattern] = {}

# ansi_style_re detects the most commonly-used ANSI-style sequences, and
# is used in func plain; it's a raw string to avoid invalid-escape warnings
ansi_style_re = compile_uncached(r'''\x1b\[([0-9;]+m|[0-9]*[A-HJKST])''')

# number_re detects numbers, and is used in func numbers; it's a raw string
# to avoid invalid-escape warnings
number_re = compile_uncached(r'''\W(-?[0-9]+(\.[0-9]*)?)\W''')

# link_re detects web links, and is used in func links
link_re_src = 'https?://[A-Za-z0-9+_.:%-]+(/[A-Za-z0-9+_.%/,#?&=-]*)*'
link_re = compile_uncached(link_re_src)

# paddable_tab_re detects single tabs and possible runs of spaces around
# them, and is used in func squeeze
paddable_tab_re = compile_uncached(' *\t *')

# seen remembers values already given to func `once`
seen = set()

# commented_re detects strings/lines which start as unix-style comments
commented_re = compile_uncached('^ *#')

# emptyish_re detects empty/emptyish strings/lines, the latter being strings
# with only spaces in them
emptyish_re = compile_uncached('^ *\r?\n?$')

# spaces_re detects runs of 2 or more spaces, and is used in func squeeze
spaces_re = compile_uncached(' +')

# awk_sep_re splits like AWK does by default, and is used in func fields
awk_sep_re = compile_uncached(' *\t *| +')


# some convenience aliases to commonly-used values

false = False
true = True
nil = None
nihil
= None 1060 none = None 1061 null = None 1062 s = '' 1063 1064 months = [ 1065 'January', 'February', 'March', 'April', 'May', 'June', 1066 'July', 'August', 'September', 'October', 'November', 'December', 1067 ] 1068 1069 monweek = [ 1070 'Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 1071 'Saturday', 'Sunday', 1072 ] 1073 1074 sunweek = [ 1075 'Sunday', 1076 'Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday', 1077 ] 1078 1079 phy = { 1080 'kilo': 1_000, 1081 'mega': 1_000_000, 1082 'giga': 1_000_000_000, 1083 'tera': 1_000_000_000_000, 1084 'peta': 1_000_000_000_000_000, 1085 'exa': 1_000_000_000_000_000_000, 1086 'zetta': 1_000_000_000_000_000_000_000, 1087 1088 'c': 299_792_458, 1089 'kcd': 683, 1090 'na': 602214076000000000000000, 1091 1092 'femto': 1e-15, 1093 'pico': 1e-12, 1094 'nano': 1e-9, 1095 'micro': 1e-6, 1096 'milli': 1e-3, 1097 1098 'e': 1.602176634e-19, 1099 'f': 96_485.33212, 1100 'h': 6.62607015e-34, 1101 'k': 1.380649e-23, 1102 'mu': 1.66053906892e-27, 1103 1104 'ge': 9.7803267715, 1105 'gn': 9.80665, 1106 } 1107 1108 physics = phy 1109 1110 # using literal strings on the cmd-line is often tricky/annoying: some of 1111 # these aliases can help get around multiple levels of string-quoting; no 1112 # quotes are needed as the script will later make these values accessible 1113 # via the property/dot syntax 1114 sym = { 1115 'amp': '&', 1116 'ampersand': '&', 1117 'ansiclear': '\x1b[0m', 1118 'ansinormal': '\x1b[0m', 1119 'ansireset': '\x1b[0m', 1120 'apo': '\'', 1121 'apos': '\'', 1122 'ast': '*', 1123 'asterisk': '*', 1124 'at': '@', 1125 'backquote': '`', 1126 'backslash': '\\', 1127 'backtick': '`', 1128 'ball': '●', 1129 'bang': '!', 1130 'bigsigma': 'Σ', 1131 'block': '█', 1132 'bquo': '`', 1133 'bquote': '`', 1134 'bslash': '\\', 1135 'btick': '`', 1136 'bullet': '•', 1137 'caret': '^', 1138 'cdot': '·', 1139 'circle': '●', 1140 'colon': ':', 1141 'comma': ',', 1142 'cr': '\r', 1143 'crlf': '\r\n', 1144 'cross': '×', 
1145 'cs': ', ', 1146 'dash': '—', 1147 'dollar': '$', 1148 'dot': '.', 1149 'dquo': '"', 1150 'dquote': '"', 1151 'emark': '!', 1152 'emdash': '—', 1153 'empty': '', 1154 'endash': '–', 1155 'eq': '=', 1156 'et': '&', 1157 'euro': '€', 1158 'ge': '≥', 1159 'geq': '≥', 1160 'gt': '>', 1161 'hellip': '…', 1162 'hole': '○', 1163 'hyphen': '-', 1164 'infinity': '∞', 1165 'lcurly': '{', 1166 'ldquo': '“', 1167 'ldquote': '“', 1168 'le': '≤', 1169 'leq': '≤', 1170 'lf': '\n', 1171 'lt': '<', 1172 'mdash': '—', 1173 'mdot': '·', 1174 'miniball': '•', 1175 'minus': '-', 1176 'ndash': '–', 1177 'neq': '≠', 1178 'perc': '%', 1179 'percent': '%', 1180 'period': '.', 1181 'plus': '+', 1182 'qmark': '?', 1183 'que': '?', 1184 'rcurly': '}', 1185 'rdquo': '”', 1186 'rdquote': '”', 1187 'sball': '•', 1188 'semi': ';', 1189 'semicolon': ';', 1190 'sharp': '#', 1191 'slash': '/', 1192 'space': ' ', 1193 'square': '■', 1194 'squo': '\'', 1195 'squote': '\'', 1196 'tab': '\t', 1197 'tilde': '~', 1198 'underscore': '_', 1199 'uscore': '_', 1200 'utf8bom': '\xef\xbb\xbf', 1201 'utf16be': '\xfe\xff', 1202 'utf16le': '\xff\xfe', 1203 } 1204 1205 symbols = sym 1206 1207 units = { 1208 'cup2l': 0.23658824, 1209 'floz2l': 0.0295735295625, 1210 'floz2ml': 29.5735295625, 1211 'ft2m': 0.3048, 1212 'gal2l': 3.785411784, 1213 'in2cm': 2.54, 1214 'lb2kg': 0.45359237, 1215 'mi2km': 1.609344, 1216 'mpg2kpl': 0.425143707, 1217 'nmi2km': 1.852, 1218 'oz2g': 28.34952312, 1219 'psi2pa': 6894.757293168, 1220 'ton2kg': 907.18474, 1221 'yd2m': 0.9144, 1222 1223 'mol': 602214076000000000000000, 1224 'mole': 602214076000000000000000, 1225 1226 'hour': 3_600, 1227 'day': 86_400, 1228 'week': 604_800, 1229 1230 'hr': 3_600, 1231 'wk': 604_800, 1232 1233 'kb': 1024, 1234 'mb': 1024**2, 1235 'gb': 1024**3, 1236 'tb': 1024**4, 1237 'pb': 1024**5, 1238 } 1239 1240 # some convenience aliases to various funcs from the python stdlib 1241 geomean = geometric_mean 1242 harmean = harmonic_mean 1243 sd = stdev 1244 
popsd = pstdev 1245 var = variance 1246 popvar = pvariance 1247 randbeta = betavariate 1248 randexp = expovariate 1249 randgamma = gammavariate 1250 randlognorm = lognormvariate 1251 randnorm = normalvariate 1252 randweibull = weibullvariate 1253 1254 capitalize = str.capitalize 1255 casefold = str.casefold 1256 center = str.center 1257 # count = str.count 1258 decode = bytes.decode 1259 encode = str.encode 1260 endswith = str.endswith 1261 expandtabs = str.expandtabs 1262 find = str.find 1263 format = str.format 1264 index = str.index 1265 isalnum = str.isalnum 1266 isalpha = str.isalpha 1267 isascii = str.isascii 1268 isdecimal = str.isdecimal 1269 isdigit = str.isdigit 1270 isidentifier = str.isidentifier 1271 islower = str.islower 1272 isnumeric = str.isnumeric 1273 isprintable = str.isprintable 1274 isspace = str.isspace 1275 istitle = str.istitle 1276 isupper = str.isupper 1277 # join = str.join 1278 ljust = str.ljust 1279 lower = str.lower 1280 lowered = str.lower 1281 lstrip = str.lstrip 1282 maketrans = str.maketrans 1283 partition = str.partition 1284 removeprefix = str.removeprefix 1285 removesuffix = str.removesuffix 1286 replace = str.replace 1287 rfind = str.rfind 1288 rindex = str.rindex 1289 rjust = str.rjust 1290 rpartition = str.rpartition 1291 rsplit = str.rsplit 1292 rstrip = str.rstrip 1293 # split = str.split 1294 splitlines = str.splitlines 1295 startswith = str.startswith 1296 strip = str.strip 1297 swapcase = str.swapcase 1298 title = str.title 1299 translate = str.translate 1300 upper = str.upper 1301 uppered = str.upper 1302 zfill = str.zfill 1303 1304 every = all 1305 rev = reversed 1306 reverse = reversed 1307 some = any 1308 1309 length = len 1310 1311 blowtabs = str.expandtabs 1312 hasprefix = str.startswith 1313 hassuffix = str.endswith 1314 ltrim = str.lstrip 1315 stripstart = str.lstrip 1316 trimspace = str.strip 1317 trimstart = str.lstrip 1318 rtrim = str.rstrip 1319 stripend = str.rstrip 1320 trimend = str.rstrip 1321 stripped = 
str.strip 1322 trim = str.strip 1323 trimmed = str.strip 1324 trimprefix = str.removeprefix 1325 trimsuffix = str.removesuffix 1326 1327 1328 def required_arg_count(f: Callable) -> int: 1329 if isinstance(f, type): 1330 return 1 1331 1332 meta = getfullargspec(f) 1333 n = len(meta.args) 1334 if meta.defaults: 1335 n -= len(meta.defaults) 1336 return n 1337 1338 1339 def identity(x: Any) -> Any: 1340 ''' 1341 Return the value given: this is the default transformer for several 1342 higher-order funcs, which effectively keeps original items as given. 1343 ''' 1344 return x 1345 1346 idem = identity 1347 iden = identity 1348 1349 1350 def after(x: Union[str, Iterable], what: Any) -> Union[str, Iterable]: 1351 'Skip parts of strings/sequences up to the substring/value given.' 1352 return (strafter if isinstance(x, str) else itemsafter)(x, what) 1353 1354 def afterlast(x: Union[str, Iterable], what: Any) -> Union[str, Iterable]: 1355 'Skip parts of strings/sequences up to the last substring/value given.' 1356 return (strafterlast if isinstance(x, str) else itemsafterlast)(x, what) 1357 1358 afterfinal = afterlast 1359 1360 def arrayish(x: Any) -> bool: 1361 'Check if a value is array-like enough.' 1362 return isinstance(x, (list, tuple, range, Generator)) 1363 1364 isarrayish = arrayish 1365 1366 def base64(x): 1367 return base64bytes(str(x).encode()).decode() 1368 1369 def basename(s: str) -> str: 1370 'Get a filepath\'s last part, if present.' 1371 return Path(s).name 1372 1373 def before(x: Union[str, Iterable], what: Any) -> Union[str, Iterable]: 1374 'End strings/sequences right before a substring/value\'s appearance.' 1375 return (strbefore if isinstance(x, str) else itemsbefore)(x, what) 1376 1377 def beforelast(x: Union[str, Iterable], what: Any) -> Union[str, Iterable]: 1378 'End strings/sequences right before a substring/value\'s last appearance.' 
    return (strbeforelast if isinstance(x, str) else itemsbeforelast)(x, what)

beforefinal = beforelast

def cases(x: Any, *args: Any) -> Any:
    '''
    Simulate a switch statement on a value, using matches/result pairs from
    the arguments given; when given an even number of extra args, None is
    used as a final fallback result; when given an odd number of extra args,
    the last argument is used as a final `default` value, if needed.
    '''

    # only floats can be NaN: checking the type first avoids exceptions
    # from func isnan when the value given isn't a number
    x_is_nan = isinstance(x, float) and isnan(x)

    for i in range(0, len(args) - len(args) % 2, 2):
        test, res = args[i], args[i+1]
        if isinstance(test, (list, tuple)) and x in test:
            return res
        if isinstance(test, float) and isnan(test) and x_is_nan:
            return res
        if x == test:
            return res
    return None if len(args) % 2 == 0 else args[-1]

switch = cases

def chunk(items: Iterable, chunk_size: int) -> Iterable:
    'Break iterable into chunks, each with up to the item-count given.'

    # validate the chunk-size first, so strings can't cause endless loops
    if not isinstance(chunk_size, int):
        raise Exception('non-integer chunk-size')
    if chunk_size < 1:
        raise Exception('non-positive chunk-size')

    if isinstance(items, str):
        for i in range(0, len(items), chunk_size):
            yield items[i:i + chunk_size]
        return

    it = iter(items)
    while True:
        head = tuple(islice(it, chunk_size))
        if not head:
            return
        yield head

chunked = chunk

def commented(s: str) -> bool:
    'Check if a string starts as a unix-style comment.'
    return commented_re.match(s) is not None

iscommented = commented

def compile(s: str, case_sensitive: bool = True) -> Pattern:
    'Cached regex `compiler`, so it\'s quicker to (re)use in formulas.'

    cache = re_cache if case_sensitive else ire_cache
    options = 0 if case_sensitive else IGNORECASE

    if s in cache:
        return cache[s]
    e = compile_uncached(s, options)
    cache[s] = e
    return e

def compose(*what: Callable) -> Callable:
    def composite(x: Any) -> Any:
        for f in what:
            x = f(x)
        return x
    return composite

composed = compose
lcompose = compose
lcomposed = compose

def cond(*args: Any) -> Any:
    '''
    Simulate a chain of if-else statements, using condition/result pairs
    from the arguments given; when given an even number of args, None is
    used as a final fallback result; when given an odd number of args, the
    last argument is used as a final `else` value, if needed.
    '''

    for i in range(0, len(args) - len(args) % 2, 2):
        if args[i]:
            return args[i+1]
    return None if len(args) % 2 == 0 else args[-1]

def conform(x: Any, denan: Any = None, deinf: Any = None, fn = str) -> Any:
    'Make values JSON-compatible.'

    # None is already JSON-compatible, as it encodes to a JSON null
    if x is None:
        return None

    if isinstance(x, float):
        # turn NaNs and Infinities into the replacement values given
        if isnan(x):
            return denan
        if isinf(x):
            return deinf
        return x

    if isinstance(x, (bool, int, str)):
        return x

    # nested values are conformed using the same replacements/fallback func
    if isinstance(x, dict):
        return {
            str(k): conform(v, denan, deinf, fn) for k, v in x.items() if not
            (isinstance(k, Skip) or isinstance(v, Skip))
        }

    if isinstance(x, Iterable):
        return [
            conform(e, denan, deinf, fn) for e in x if not isinstance(e, Skip)
        ]

    if isinstance(x, DotCallable):
        return x.value

    return fn(x)

fix = conform

def countif(src: Iterable, check: Callable) -> int:
    '''
    Count how many values make the func given true-like. This func works with
    sequences, dictionaries, and strings.
1506 ''' 1507 1508 if callable(src): 1509 src, check = check, src 1510 check = predicate(check) 1511 1512 total = 0 1513 if isinstance(src, dict): 1514 for v in src.values(): 1515 if check(v): 1516 total += 1 1517 else: 1518 for v in src: 1519 if check(v): 1520 total += 1 1521 return total 1522 1523 # def debase64(x): 1524 # return debase64bytes(str(x).encode()).decode() 1525 1526 def debase64(s: str) -> bytes: 1527 'Convert away from base64 encoding, including data-URIs.' 1528 1529 if s.startswith('data:'): 1530 i = s.find(',') 1531 if i >= 0: 1532 return standard_b64decode(s[i + 1:]) 1533 return standard_b64decode(s) 1534 1535 unbase64 = debase64 1536 1537 def dedup(v: Iterable) -> Iterable: 1538 'Ignore reappearing items from iterables, after their first occurrence.' 1539 1540 got = set() 1541 for e in v: 1542 if not e in got: 1543 got.add(e) 1544 yield e 1545 1546 dedupe = dedup 1547 deduped = dedup 1548 deduplicate = dedup 1549 deduplicated = dedup 1550 undup = dedup 1551 undupe = dedup 1552 unduped = dedup 1553 unduplicate = dedup 1554 unduplicated = dedup 1555 unique = dedup 1556 uniqued = dedup 1557 1558 def defunc(x: Any) -> Any: 1559 'Call if value is a func, or return it back as given.' 1560 return x() if callable(x) else x 1561 1562 callmemaybe = defunc 1563 defunct = defunc 1564 unfunc = defunc 1565 unfunct = defunc 1566 1567 def dejson(x: Any, catch: Union[Callable[[Exception], Any], Any] = None) -> Any: 1568 'Safely parse JSON from strings.' 1569 try: 1570 return loads(x) if isinstance(x, str) else x 1571 except Exception as e: 1572 return catch(e) if callable(catch) else catch 1573 1574 unjson = dejson 1575 1576 def denan(x: Any, fallback: Any = None) -> Any: 1577 'Replace floating-point NaN with the alternative value given.' 1578 return x if not (isinstance(x, float) and isnan(x)) else fallback 1579 1580 def denil(*args: Any) -> Any: 1581 'Avoid None values, if possible: first value which isn\'t None wins.' 
1582 for e in args: 1583 if e != None: 1584 return e 1585 return None 1586 1587 denone = denil 1588 denull = denil 1589 1590 def dirname(s: str) -> str: 1591 'Ignore the last part of a filepath.' 1592 return str(Path(s).parent) 1593 1594 def dive(into: Any, doing: Callable) -> Any: 1595 'Transform a nested value by calling a func via depth-first recursion.' 1596 1597 # support args in either order 1598 if callable(into): 1599 into, doing = doing, into 1600 1601 return _dive_kv(None, into, doing) 1602 1603 deepmap = dive 1604 dive1 = dive 1605 1606 def divebin(x: Any, y: Any, doing: Callable) -> Any: 1607 'Nested 2-value version of depth-first-recursive func dive.' 1608 1609 # support args in either order 1610 if callable(x): 1611 x, y, doing = y, doing, x 1612 1613 narg = required_arg_count(doing) 1614 if narg == 2: 1615 return dive(x, lambda a: dive(y, lambda b: doing(a, b))) 1616 if narg == 4: 1617 return dive(x, lambda i, a: dive(y, lambda j, b: doing(i, a, j, b))) 1618 raise Exception('divebin(...) only supports funcs with 2 or 4 args') 1619 1620 bindive = divebin 1621 # diveboth = divebin 1622 # dualdive = divebin 1623 # duodive = divebin 1624 dive2 = divebin 1625 1626 def _dive_kv(key: Any, into: Any, doing: Callable) -> Any: 1627 if isinstance(into, dict): 1628 return {k: _dive_kv(k, v, doing) for k, v in into.items()} 1629 if isinstance(into, Iterable) and not isinstance(into, str): 1630 return [_dive_kv(i, e, doing) for i, e in enumerate(into)] 1631 1632 narg = required_arg_count(doing) 1633 return doing(key, into) if narg == 2 else doing(into) 1634 1635 class DotCallable: 1636 'Enable convenient dot-syntax calling of 1-input funcs.' 1637 1638 def __init__(self, value: Any): 1639 self.value = value 1640 1641 def __getattr__(self, key: str) -> Any: 1642 return DotCallable(globals()[key](self.value)) 1643 1644 class Dottable: 1645 'Enable convenient dot-syntax access to dictionary values.' 
1646 1647 def __getattr__(self, key: Any) -> Any: 1648 return self.__dict__[key] if key in self.__dict__ else None 1649 1650 def __getitem__(self, key: Any) -> Any: 1651 return self.__dict__[key] if key in self.__dict__ else None 1652 1653 def __iter__(self) -> Iterable: 1654 return iter(self.__dict__) 1655 1656 def dotate(x: Any) -> Union[Dottable, Any]: 1657 'Recursively ensure all dictionaries in a value are dot-accessible.' 1658 1659 if isinstance(x, dict): 1660 d = Dottable() 1661 d.__dict__ = {k: dotate(v) for k, v in x.items()} 1662 return d 1663 if isinstance(x, list): 1664 return [dotate(e) for e in x] 1665 if isinstance(x, tuple): 1666 return tuple(dotate(e) for e in x) 1667 return x 1668 1669 dotated = dotate 1670 dote = dotate 1671 doted = dotate 1672 dotified = dotate 1673 dotify = dotate 1674 dottified = dotate 1675 dottify = dotate 1676 1677 # make dictionaries `physics`, `symbols`, and `units` easier to use 1678 phy = dotate(phy) 1679 physics = phy 1680 sym = dotate(sym) 1681 symbols = sym 1682 units = dotate(units) 1683 1684 def drop(src: Any, *what) -> Any: 1685 ''' 1686 Either ignore all substrings occurrences, or ignore all keys given from 1687 an object, or even from a sequence of objects. 1688 ''' 1689 1690 if isinstance(src, str): 1691 return strdrop(src, *what) 1692 return _itemsdrop(src, set(what)) 1693 1694 dropped = drop 1695 # ignore = drop 1696 # ignored = drop 1697 1698 def _itemsdrop(src: Any, what: Set) -> Any: 1699 if isinstance(src, dict): 1700 kv = {} 1701 for k, v in src.items(): 1702 if not (k in what): 1703 kv[k] = v 1704 return kv 1705 1706 if isinstance(src, Iterable): 1707 return [_itemsdrop(e, what) for e in src] 1708 1709 return None 1710 1711 def each(src: Iterable, f: Callable) -> Any: 1712 ''' 1713 A generalization of built-in func map, which can also handle dictionaries 1714 and strings. 
1715 ''' 1716 1717 if callable(src): 1718 src, f = f, src 1719 1720 if isinstance(src, dict): 1721 return mapkv(src, lambda k, _: k, f) 1722 1723 if isinstance(src, str): 1724 s = StringIO() 1725 f = loopify(f) 1726 for i, c in enumerate(src): 1727 v = f(i, c) 1728 if not isinstance(v, Skip): 1729 s.write(str(v)) 1730 return s.getvalue() 1731 1732 return tuple(f(i, v) for i, v in enumerate(src)) 1733 1734 mapped = each 1735 1736 def emptyish(x: Any) -> bool: 1737 ''' 1738 Check if a value can be considered empty, which includes non-empty 1739 strings which only have spaces in them. 1740 ''' 1741 1742 def check(x: Any) -> bool: 1743 if not x: 1744 return True 1745 if isinstance(x, str): 1746 return bool(emptyish_re.match(x)) 1747 return False 1748 1749 if check(x): 1750 return True 1751 if isinstance(x, Iterable): 1752 return all(check(e) for e in x) 1753 return False 1754 1755 isemptyish = emptyish 1756 1757 def endict(x: Any) -> Dict[str, Any]: 1758 'Turn non-dictionary values into dictionaries with string keys.' 1759 1760 if isinstance(x, dict): 1761 return {str(k): v for k, v in x.items()} 1762 if arrayish(x): 1763 return {str(e): e for e in x} 1764 return {str(x): x} 1765 1766 dicted = endict 1767 endicted = endict 1768 indict = endict 1769 todict = endict 1770 1771 def enfloat(x: Any, fallback: float = nan) -> float: 1772 try: 1773 return float(x) 1774 except Exception: 1775 return fallback 1776 1777 enfloated = enfloat 1778 floated = enfloat 1779 floatify = enfloat 1780 floatize = enfloat 1781 tofloat = enfloat 1782 1783 def enint(x: Any, fallback: Any = None) -> Any: 1784 try: 1785 return int(x) 1786 except Exception: 1787 return fallback 1788 1789 eninted = enint 1790 inted = enint 1791 integered = enint 1792 intify = enint 1793 intize = enint 1794 toint = enint 1795 1796 def enlist(x: Any) -> List[Any]: 1797 'Turn non-list values into lists.' 
1798 return list(x) if arrayish(x) else [x] 1799 1800 # inlist = enlist 1801 enlisted = enlist 1802 listify = enlist 1803 listize = enlist 1804 tolist = enlist 1805 1806 def entuple(x: Any) -> Tuple[Any, ...]: 1807 'Turn non-tuple values into tuples.' 1808 return tuple(x) if arrayish(x) else (x, ) 1809 1810 entupled = entuple 1811 ntuple = entuple 1812 ntupled = entuple 1813 tuplify = entuple 1814 tuplize = entuple 1815 toentuple = entuple 1816 tontuple = entuple 1817 totuple = entuple 1818 1819 def error(message: Any) -> Exception: 1820 return Exception(str(message)) 1821 1822 err = error 1823 1824 def ext(s: str) -> str: 1825 'Get a filepath\'s extension, if present.' 1826 1827 name = Path(s).name 1828 i = name.rfind('.') 1829 return name[i:] if i >= 0 else '' 1830 1831 filext = ext 1832 1833 def fail(message: Any, error_code: int = 255) -> NoReturn: 1834 stdout.flush() 1835 print(f'\x1b[31m{message}\x1b[0m', file=stderr) 1836 quit(error_code) 1837 1838 abort = fail 1839 bail = fail 1840 1841 def fields(s: str) -> Iterable[str]: 1842 'Split fields AWK-style from the string given.' 1843 return awk_sep_re.split(s.strip()) 1844 1845 # items = fields 1846 splitfields = fields 1847 splititems = fields 1848 words = fields 1849 1850 def first(items: SupportsIndex, fallback: Any = None) -> Any: 1851 return items[0] if len(items) > 0 else fallback 1852 1853 def flappend(*args: Any) -> List[Any]: 1854 'Turn arbitrarily-nested values/sequences into a single flat sequence.' 1855 1856 flat = [] 1857 def dig(x: Any) -> None: 1858 if arrayish(x): 1859 for e in x: 1860 dig(e) 1861 elif isinstance(x, dict): 1862 for e in x.values(): 1863 dig(e) 1864 else: 1865 flat.append(x) 1866 1867 for e in args: 1868 dig(e) 1869 return flat 1870 1871 def flat(*args: Any) -> Iterable: 1872 'Turn arbitrarily-nested values/sequences into a single flat sequence.' 

    def _flat_rec(x: Any) -> Iterable:
        if x is None:
            return

        if isinstance(x, dict):
            # only dictionary values are flattened: returning right away
            # prevents the generic iterable case below from emitting keys
            yield from _flat_rec(x.values())
            return

        if isinstance(x, str):
            yield x
            return

        if isinstance(x, Iterable):
            for e in x:
                yield from _flat_rec(e)
            return

        yield x

    for x in args:
        yield from _flat_rec(x)

flatten = flat
flattened = flat

def fromto(start, stop, f: Callable = identity) -> Iterable:
    'Sequence all integers between the numbers given, end-value included.'
    return (f(e) for e in range(start, stop + 1))

def fuzz(x: Union[int, float]) -> Union[float, Dict[str, float]]:
    '''
    Deapproximate numbers to their max range before approximation: the
    result is a dictionary with the guessed lower-bound number, the number
    given, and the guessed upper-bound number which can approximate to the
    original number given. NaNs and the infinities are returned as given,
    instead of resulting in a dictionary.
    '''

    if isnan(x) or isinf(x):
        return x

    if x == 0:
        return {'-0.5': -0.5, '0': 0.0, '0.5': +0.5}

    if x % 1 != 0:
        # return surrounding integers when given non-integers
        a = floor(x)
        b = ceil(x)
        return {str(a): a, str(x): x, str(b): b}

    if x % 10 != 0:
        a = x - 0.5
        b = x + 0.5
        return {str(a): a, str(x): x, str(b): b}

    # find the integer log10 of the absolute value; 0 was handled previously
    y = int(abs(x))
    p10 = 1
    while True:
        if y % p10 != 0:
            p10 /= 10
            break
        p10 *= 10
    delta = p10 / 2

    s = +1 if x > 0 else -1
    ux = abs(x)
    a = s * ux - delta
    b = s * ux + delta
    return {str(a): a, str(x): x, str(b): b}

def generated(src: Any) -> Any:
    'Make tuples out of generators, or return non-generator values as given.'
1946 return tuple(src) if isinstance(src, (Generator, range)) else src 1947 1948 concrete = generated 1949 concreted = generated 1950 concretize = generated 1951 concretized = generated 1952 degen = generated 1953 degenerate = generated 1954 degenerated = generated 1955 degenerator = generated 1956 gen = generated 1957 generate = generated 1958 synth = generated 1959 synthed = generated 1960 synthesize = generated 1961 synthesized = generated 1962 1963 def group(src: Iterable, by: Callable = identity) -> Dict: 1964 ''' 1965 Separate transformed items into arrays, the final result being a dict 1966 whose keys are all the transformed values, and whose values are lists 1967 of all the original values which did transform to their group's key. 1968 ''' 1969 1970 if callable(src): 1971 src, by = by, src 1972 1973 by = loopify(by) 1974 kv = src.items() if isinstance(src, dict) else enumerate(src) 1975 1976 groups = {} 1977 for k, v in kv: 1978 dk = by(k, v) 1979 if isinstance(dk, Skip) or isinstance(v, Skip): 1980 continue 1981 if dk in groups: 1982 groups[dk].append(v) 1983 else: 1984 groups[dk] = [v] 1985 return groups 1986 1987 grouped = group 1988 1989 def gire(src: Iterable[str], using: Iterable[str], fallback: Any = '') -> Dict: 1990 ''' 1991 Group matched items into arrays, the final result being a dict whose 1992 keys are all the matchable regexes given, and whose values are lists 1993 of all the original values which did case-insensitively match their 1994 group's key as a regex. 1995 ''' 1996 1997 using = tuple(using) 1998 return group(src, lambda x: imatch(x, using, fallback)) 1999 2000 gbire = gire 2001 groupire = gire 2002 2003 def gre(src: Iterable[str], using: Iterable[str], fallback: Any = '') -> Dict: 2004 ''' 2005 Group matched items into arrays, the final result being a dict whose 2006 keys are all the matchable regexes given, and whose values are lists 2007 of all the original values which did regex-match their group's key. 
2008 ''' 2009 2010 using = tuple(using) 2011 return group(src, lambda x: match(x, using, fallback)) 2012 2013 gbre = gre 2014 groupre = gre 2015 2016 def gsub(s: str, what: str, repl: str) -> str: 2017 'Replace all regex-matches with the string given.' 2018 return compile(what).sub(repl, s) 2019 2020 def harden(f: Callable, fallback: Any = None) -> Callable: 2021 def _hardened_caller(*args): 2022 try: 2023 return f(*args) 2024 except Exception: 2025 return fallback 2026 return _hardened_caller 2027 2028 hardened = harden 2029 insure = harden 2030 insured = harden 2031 2032 def horner(coeffs: List[float], x: Union[int, float]) -> float: 2033 if isinstance(coeffs, (int, float)): 2034 coeffs, x = x, coeffs 2035 2036 if len(coeffs) == 0: 2037 return 0 2038 2039 y = coeffs[0] 2040 for c in islice(coeffs, 1, None): 2041 y *= x 2042 y += c 2043 return y 2044 2045 polyval = horner 2046 2047 def idiota(n: int, f: Callable = identity) -> Dict[int, int]: 2048 'ID (keys) version of func iota.' 2049 return { v: v for v in (f(e) for e in range(1, n + 1))} 2050 2051 dictiota = idiota 2052 kviota = idiota 2053 2054 def imatch(what: str, using: Iterable[str], fallback: str = '') -> str: 2055 'Try to case-insensitively match a string with any of the regexes given.' 2056 2057 if not isinstance(what, str): 2058 what, using = using, what 2059 2060 for s in using: 2061 expr = compile(s, False) 2062 m = expr.search(what) 2063 if m: 2064 # return what[m.start():m.end()] 2065 return s 2066 return fallback 2067 2068 def indices(x: Any) -> Iterable[Any]: 2069 'List all indices/keys, or get an exclusive range from an int.' 2070 2071 if isinstance(x, int): 2072 return range(x) 2073 if isinstance(x, dict): 2074 return x.keys() 2075 if isinstance(x, (str, list, tuple)): 2076 return range(len(x)) 2077 return tuple() 2078 2079 keys = indices 2080 2081 def ints(start, stop, f: Callable = identity) -> Iterable[int]: 2082 'Sequence integers, end-value included.' 

    if isnan(start) or isnan(stop) or isinf(start) or isinf(stop):
        return tuple()
    # floor the end-value, since int() would round negative fractions up
    return (f(e) for e in range(int(ceil(start)), int(floor(stop)) + 1))

integers = ints

def iota(n: int, f: Callable = identity) -> Iterable[int]:
    'Sequence all integers from 1 up to (and including) the int given.'
    return (f(e) for e in range(1, n + 1))

def itemsafter(x: Iterable, what: Any) -> Iterable:
    ok = False
    check = predicate(what)
    for e in x:
        if ok:
            yield e
        elif check(e):
            ok = True

def itemsafterlast(x: Iterable, what: Any) -> Iterable:
    rest: List[Any] = []
    check = predicate(what)
    for e in x:
        if check(e):
            rest.clear()
        else:
            rest.append(e)

    # rest only has the items following the last match, or all items when
    # nothing matched
    yield from rest

def itemsbefore(x: Iterable, what: Any) -> Iterable:
    check = predicate(what)
    for e in x:
        if check(e):
            return
        yield e

def itemsbeforelast(x: Iterable, what: Any) -> Iterable:
    items = list(x)
    check = predicate(what)

    # find the index of the last matching item, if any
    i = -1
    for j in range(len(items) - 1, -1, -1):
        if check(items[j]):
            i = j
            break

    if i < 0:
        # no match: emit all items, the same way func itemsbefore does
        yield from items
        return
    yield from islice(items, 0, i)

def itemssince(x: Iterable, what: Any) -> Iterable:
    ok = False
    check = predicate(what)
    for e in x:
        ok = ok or check(e)
        if ok:
            yield e

def itemssincelast(x: Iterable, what: Any) -> Iterable:
    rest: List[Any] = []
    check = predicate(what)
    for e in x:
        if check(e):
            rest.clear()
        # matching items are kept too, as results start from the last match
        rest.append(e)
    return rest

def itemsuntil(x: Iterable, what: Any) -> Iterable:
    check = predicate(what)
    for e in x:
        yield e
        if check(e):
            return

def itemsuntillast(x: Iterable, what: Any) -> Iterable:
    items = list(x)
    check = predicate(what)

    # find the index of the last matching item, if any
    i = -1
    for j in range(len(items) - 1, -1, -1):
        if check(items[j]):
            i = j
            break

    if i < 0:
        # no match: emit all items, the same way func itemsuntil does
        yield from items
        return
    yield from islice(items, 0, i + 1)

itemsuntilfinal = itemsuntillast

def join(items: Iterable, sep: Union[str, Iterable] = ' ') -> Union[str, Dict]:
    '''
    Join iterables using the separator-string given: its 2 arguments
    can come in either order, and are sorted out if needed. When given
    2 non-string iterables, the result is an object whose keys are from
    the first argument, and whose values are from the second one.

    You can use it any of the following ways, where `keys` and `values` are
    sequences (lists, tuples, or generators), and `separator` is a string:

        join(values)
        join(values, separator)
        join(separator, values)
        join(keys, values)
    '''

    if arrayish(items) and arrayish(sep):
        return {k: v for k, v in zip(items, sep)}
    if isinstance(items, str):
        items, sep = sep, items
    return sep.join(str(e) for e in items)

def joined_paragraphs(lines: Iterable[str]) -> Iterable[Sequence[str]]:
    '''
    Regroup lines into individual paragraphs, each of which can span multiple
    lines: such paragraphs have no empty lines in them, and never end with a
    trailing line-feed.
    '''

    par: List[str] = []
    for l in lines:
        if l:
            par.append(l)
        elif par:
            # an empty line ends the current paragraph; runs of empty lines
            # are ignored, so paragraphs never start with empty lines
            yield '\n'.join(par)
            par.clear()

    if len(par) > 0:
        yield '\n'.join(par)

def json0(x: Any) -> str:
    'Encode value into a minimal single-line JSON string.'
    return dumps(x, separators=(',', ':'), allow_nan=False, indent=None)

j0 = json0

def json2(x: Any) -> str:
    '''
    Encode value into a (possibly multiline) JSON string, using 2 spaces for
    each indentation level.
    '''
    return dumps(x, separators=(',', ': '), allow_nan=False, indent=2)

j2 = json2

def jsonl(x: Any) -> Iterable:
    'Turn value into multiple JSON-encoded strings, known as JSON Lines.'

    if x is None:
        yield dumps(x, allow_nan=False)
    elif isinstance(x, (bool, int, float, dict, str)):
        yield dumps(x, allow_nan=False)
    elif isinstance(x, Iterable):
        for e in x:
            yield dumps(e, allow_nan=False)
    else:
        yield dumps(str(x), allow_nan=False)

jsonlines = jsonl
ndjson = jsonl
tojsonl = jsonl
tojsonlines = jsonl

def keep(src: Iterable, pred: Any) -> Iterable:
    '''
    A generalization of built-in func filter, which can also handle dicts and
    strings.
    '''

    if callable(src):
        src, pred = pred, src
    pred = predicate(pred)
    pred = loopify(pred)

    if isinstance(src, str):
        out = StringIO()
        for i, c in enumerate(src):
            if pred(i, c):
                out.write(c)
        return out.getvalue()

    if isinstance(src, dict):
        return { k: v for k, v in src.items() if pred(k, v) }
    return (e for i, e in enumerate(src) if pred(i, e))

filtered = keep
kept = keep

def last(items: Sequence, fallback: Any = None) -> Any:
    # the argument must support both len() and negative indexing
    return items[-1] if len(items) > 0 else fallback

def links(src: Any) -> Iterable:
    'Auto-detect all (HTTP/HTTPS) hyperlink-like substrings.'
    if isinstance(src, str):
        for m in link_re.finditer(src):
            yield m.group(0)
    elif isinstance(src, dict):
        for k, v in src.items():
            # search keys recursively too, instead of emitting their chars
            yield from links(k)
            yield from links(v)
    elif isinstance(src, Iterable):
        for v in src:
            yield from links(v)

def loopify(x: Callable) -> Callable:
    nargs = required_arg_count(x)
    if nargs == 2:
        return x
    elif nargs == 1:
        return lambda _, v: x(v)
    else:
        raise Exception('only funcs with 1 or 2 args are supported')

def mapkv(src: Iterable, key: Callable, value: Callable = identity) -> Dict:
    '''
    A map-like func for dictionaries, which uses 2 mapping funcs, the first
    for the keys, the second for the values.
    '''

    if key is None:
        key = lambda k, _: k

    if callable(src):
        src, key, value = value, src, key

    if required_arg_count(key) != 2:
        oldkey = key
        key = lambda k, _: oldkey(k)

    key = loopify(key)
    value = loopify(value)
    # if isinstance(src, dict):
    #     return { key(k, v): value(k, v) for k, v in src.items() }
    # return { key(i, v): value(i, v) for i, v in enumerate(src) }

    def add(k, v, to):
        dk = key(k, v)
        dv = value(k, v)
        if isinstance(dk, Skip) or isinstance(dv, Skip):
            return
        to[dk] = dv

    res = {}
    kv = src.items() if isinstance(src, dict) else enumerate(src)
    for k, v in kv:
        add(k, v, res)
    return res

def match(what: str, using: Iterable[str], fallback: str = '') -> str:
    'Try to match a string with any of the regexes given.'
    if not isinstance(what, str):
        what, using = using, what

    for s in using:
        expr = compile(s)
        m = expr.search(what)
        if m:
            # return what[m.start():m.end()]
            return s
    return fallback

def maybe(f: Callable, x: Any) -> Any:
    '''
    Try calling a func on a value, using the same value as a fallback result,
    in case of exceptions.
    '''

    if not callable(f):
        f, x = x, f
    try:
        return f(x)
    except Exception:
        return x

def mappend(*args) -> Dict:
    kv = {}
    for src in args:
        if not isinstance(src, dict):
            raise Exception('mappend only works with dictionaries')
        # later dictionaries override keys from earlier ones
        kv.update(src)
    return kv

def message(x: Any, result: Any = skip) -> Any:
    print(x, file=stderr)
    return result

msg = message

def must(cond: Any, errmsg: str = 'condition given not always true') -> None:
    'Enforce conditions, raising an exception on failure.'
    if not cond:
        raise Exception(errmsg)

demand = must
enforce = must

def nowdict() -> dict:
    v = datetime.now()
    return {
        'year': v.year,
        'month': v.month,
        'day': v.day,
        'hour': v.hour,
        'minute': v.minute,
        'second': v.second,
        'text': v.strftime('%Y-%m-%d %H:%M:%S %b %a'),
        'weekday': v.strftime('%A'),
    }

def number(x: Any) -> Union[int, float, Any]:
    '''
    Try to turn the value given into a number: floats are returned as they
    are, other values become integers when possible, or floats otherwise.
    An exception is raised when neither conversion works.
    '''

    if isinstance(x, float):
        return x

    try:
        return int(x)
    except Exception:
        return float(x)

def numbers(src: Any) -> Iterable:
    'Auto-detect all number-like substrings.'
    if isinstance(src, str):
        for m in number_re.finditer(src):
            yield m.group(0).strip()
    elif isinstance(src, dict):
        for k, v in src.items():
            # recurse using this same func, instead of func links
            yield from numbers(k)
            yield from numbers(v)
    elif isinstance(src, Iterable):
        for v in src:
            yield from numbers(v)

def numsign(x: Union[int, float]) -> Union[int, float]:
    'Get a number\'s sign, or NaN if the number given is a NaN.'

    if isinstance(x, int):
        if x > 0:
            return +1
        if x < 0:
            return -1
        return 0

    if isnan(x):
        return x

    if x > 0:
        return +1.0
    if x < 0:
        return -1.0
    return 0.0

def numstats(src: Any) -> Dict[str, Union[float, int]]:
    'Gather several single-pass numeric statistics.'

    n = mean_sq = ln_sum = 0
    least = +inf
    most = -inf
    total = mean = 0
    prod = 1
    nans = ints = pos = zero = neg = 0

    def update_numstats(x: Any) -> None:
        nonlocal nans, n, ints, pos, neg, zero, least, most, total, prod
        nonlocal ln_sum, mean, mean_sq

        if not isinstance(x, (float, int)):
            return

        if isnan(x):
            nans += 1
            return

        n += 1
        # func floor raises exceptions when given infinities
        whole = isinstance(x, int) or (not isinf(x) and x == floor(x))
        ints += int(whole)

        if x > 0:
            pos += 1
        elif x < 0:
            neg += 1
        else:
            zero += 1

        least = min(least, x)
        most = max(most, x)

        # total += x
        prod *= x
        # func log raises exceptions for non-positive numbers: a NaN sum
        # makes the geometric mean a NaN in those cases
        ln_sum += log(x) if x > 0 else nan

        # update the running mean and sum of squared differences
        d1 = x - mean
        mean += d1 / n
        d2 = x - mean
        mean_sq += d1 * d2

    def _numstats_rec(src: Any) -> None:
        if isinstance(src, dict):
            for e in src.values():
                _numstats_rec(e)
        elif isinstance(src, Iterable) and not isinstance(src, str):
            for e in src:
                _numstats_rec(e)
        else:
            update_numstats(src)

    _numstats_rec(src)

    sd = nan
    geomean = nan
    if n > 0:
        sd = sqrt(mean_sq / n)
        geomean \
            = exp(ln_sum / n) if not isinf(ln_sum) else nan
        total = n * mean

    return {
        'n': n,
        'nan': nans,
        'min': least,
        'max': most,
        'sum': total,
        'mean': mean,
        'geomean': geomean,
        'sd': sd,
        'product': prod,
        'integer': ints,
        'positive': pos,
        'zero': zero,
        'negative': neg,
    }

def once(x: Any, replacement: Any = None) -> Any:
    '''
    Replace the first argument given after the first time this func has been
    given it: this is a deliberately stateful function, given its purpose.
    '''

    if x not in seen:
        seen.add(x)
        return x
    else:
        return replacement

onced = once

def pad(s: str, n: int, pad: str = ' ') -> str:
    l = len(s)
    return s if l >= n else s + ((n - l) // len(pad)) * pad

def padcenter(s: str, n: int, pad: str = ' ') -> str:
    return s.center(n, pad)

centerpad = padcenter
centerpadded = padcenter
cjust = padcenter
cpad = padcenter
padc = padcenter
paddedcenter = padcenter

def padend(s: str, n: int, pad: str = ' ') -> str:
    # string method ljust left-justifies text, which pads it at its end
    return s.ljust(n, pad)

padr = padend
padright = padend
paddedend = padend
paddedright = padend
rpad = padend
rightpad = padend
rightpadded = padend

def padstart(s: str, n: int, pad: str = ' ') -> str:
    # string method rjust right-justifies text, which pads it at its start
    return s.rjust(n, pad)

lpad = padstart
leftpad = padstart
leftpadded = padstart
padl = padstart
padleft = padstart
paddedleft = padstart
paddedstart = padstart

def panic(x: Any) -> None:
    raise Exception(x)

def paragraphize(lines: Iterable[str]) -> Iterable[Sequence[str]]:
    '''
    Regroup lines into individual paragraphs, each of which is a list of
    single-line strings, none of which ends with a trailing line-feed.
    '''

    par: List[str] = []
    for l in lines:
        if not l:
            # empty lines only separate paragraphs, and are never kept
            if par:
                yield par
                # start a new list, so lists already emitted stay intact
                par = []
        else:
            par.append(l)

    if len(par) > 0:
        yield par

paragraphed = paragraphize
paragraphs = paragraphize
paragroup = paragraphize
pargroup = paragraphize

def parse(s: str, fallback: Any = None) -> Any:
    'Try to parse JSON, ignoring exceptions in favor of a fallback value.'

    try:
        return loads(s)
    except Exception:
        return fallback

fromjson = parse
parsed = parse
loaded = parse
unjson = parse

def pick(src: Any, *what) -> Any:
    'Pick only the keys given from an object, or even a sequence of objects.'

    if isinstance(src, dict):
        kv = {}
        for k in what:
            kv[k] = src[k]
        return kv

    if isinstance(src, Iterable):
        return [pick(e, *what) for e in src]

    return None

picked = pick

def plain(s: str) -> str:
    'Ignore all ANSI-style sequences in a string.'
    return ansi_style_re.sub('', s)

def predicate(x: Any) -> Callable:
    'Helps various higher-order funcs, by standardizing `predicate` values.'

    if callable(x):
        return x

    if isinstance(x, float):
        if isnan(x):
            return lambda y: isinstance(y, float) and isnan(y)
        if isinf(x):
            return lambda y: isinstance(y, float) and isinf(y)

    return lambda y: x == y

pred = predicate

def quoted(s: str, quote: str = '"') -> str:
    'Surround a string with quotes.'
    return f'{quote}{s}{quote}'

def recover(*args) -> Any:
    '''
    Catch exceptions using a lambda/callback func, in one of 6 ways:
        recover(zero_args_func)
        recover(zero_args_func, exception_replacement_value)
        recover(zero_args_func, one_arg_exception_handling_func)
        recover(one_arg_func, arg)
        recover(one_arg_func, arg, exception_replacement_value)
        recover(one_arg_func, arg, one_arg_exception_handling_func)
    '''

    if len(args) == 1:
        f = args[0]
        try:
            return f()
        except Exception:
            return None
    elif len(args) == 2:
        f, fallback = args[0], args[1]
        if callable(f) and callable(fallback):
            try:
                return f()
            except Exception as e:
                nargs = required_arg_count(fallback)
                return fallback(e) if nargs == 1 else fallback()
        else:
            try:
                return f() if required_arg_count(f) == 0 else f(args[1])
            except Exception:
                return fallback
    elif len(args) == 3:
        f, x, fallback = args[0], args[1], args[2]
        if callable(f) and callable(fallback):
            try:
                return f(x)
            except Exception as e:
                nargs = required_arg_count(fallback)
                return fallback(e) if nargs == 1 else fallback()
        else:
            try:
                return f(x)
            except Exception:
                return fallback
    else:
        raise Exception('recover(...) only works with 1, 2, or 3 args')

attempt = recover
attempted = recover
recovered = recover
recoverred = recover
rescue = recover
rescued = recover
trycall = recover

def reject(src: Iterable, pred: Any) -> Iterable:
    '''
    A generalization of built-in func filter, which uses predicate funcs the
    opposite way, and which can also handle dicts and strings.
    '''

    if callable(src):
        src, pred = pred, src
    pred = predicate(pred)
    pred = loopify(pred)

    if isinstance(src, str):
        out = StringIO()
        for i, c in enumerate(src):
            if not pred(i, c):
                out.write(c)
        return out.getvalue()

    if isinstance(src, dict):
        return { k: v for k, v in src.items() if not pred(k, v) }
    return (e for i, e in enumerate(src) if not pred(i, e))

avoid = reject
avoided = reject
keepout = reject
keptout = reject
rejected = reject

def retype(x: Any) -> Any:
    'Try to narrow the type of the value given.'

    if isinstance(x, float):
        # func floor raises exceptions when given NaNs or infinities
        if isnan(x) or isinf(x):
            return x
        return int(x) if floor(x) == x else x

    if not isinstance(x, str):
        return x

    try:
        return loads(x)
    except Exception:
        pass

    try:
        return int(x)
    except Exception:
        pass

    try:
        return float(x)
    except Exception:
        pass

    return x

autocast = retype
mold = retype
molded = retype
narrow = retype
narrowed = retype
recast = retype
recasted = retype
remold = retype
remolded = retype
retyped = retype

def revcompose(*what: Callable) -> Callable:
    def composite(x: Any) -> Any:
        for f in reversed(what):
            x = f(x)
        return x
    return composite

rcompose = revcompose
rcomposed = revcompose
revcomposed = revcompose

def revsort(iterable: Iterable, key: Optional[Callable] = None) -> Iterable:
    return sorted(iterable, key=key, reverse=True)

revsorted = revsort

# def revsortkv(src: Dict, key: Callable = None) -> Dict:
#     if not key:
#         key = lambda kv: (kv[1], kv[0])
#     return sortkv(src, key, reverse=True)

def revsortkv(src: Dict, key: Optional[Callable] = None) -> Dict:
    if key is None:
        key = lambda x: x[1]
    return sortkv(src, key, reverse=True)

revsortedkv = \
    revsortkv

def rstripdecs(s: str) -> str:
    '''
    Ignore trailing zero decimals on number-like strings; even ignore
    the decimal dot if trailing.
    '''

    try:
        f = float(s)
        if isnan(f) or isinf(f):
            return s

        dot = s.find('.')
        if dot < 0:
            return s

        s = s.rstrip('0')
        return s[:-1] if s.endswith('.') else s
    except Exception:
        return s

chopdecs = rstripdecs

def scale(x: float, x0: float, x1: float, y0: float, y1: float) -> float:
    'Transform a value from a linear domain into another linear one.'
    return (y1 - y0) * (x - x0) / (x1 - x0) + y0

rescale = scale
rescaled = scale
scaled = scale

def shortened(s: str, maxlen: int, trailer: str = '') -> str:
    'Limit strings to the symbol-count given, including an optional trailer.'
    maxlen = max(maxlen, 0)
    return s if len(s) <= maxlen else s[:maxlen - len(trailer)] + trailer

def shuffled(x: Any) -> Any:
    'Return a shuffled copy of the list given.'
    y = copy(x)
    shuffle(y)
    return y

def split(src: Union[str, Sequence], n: Union[str, int]) -> Iterable:
    'Split/break a string/sequence into several chunks/parts.'

    if isinstance(src, str) and isinstance(n, str):
        return src.split(n)
    if not (isinstance(src, (str, Sequence)) and isinstance(n, int)):
        raise Exception('unsupported type-pair of arguments')

    if n < 1:
        return []

    l = len(src)
    if l <= n:
        # string method split raises exceptions when given empty separators
        return list(src) if isinstance(src, str) else src

    # use equally-sized chunks, except possibly for the last one
    csize = int(ceil(l / n))
    return [src[i:i + csize] for i in range(0, l, csize)]

broken = split
splitted = split
splitten = split

def strdrop(x: str, *what: str) -> str:
    'Ignore all occurrences of all substrings given.'
    for s in what:
        x = x.replace(s, '')
    return x

strignore = strdrop

def stringify(x: Any) -> str:
    'Fancy alias for func dumps, named after JavaScript\'s func.'
    return dumps(x, separators=(', ', ': '), allow_nan=False, indent=None)

jsonate = stringify
jsonify = stringify
tojson = stringify

def strafter(x: str, what: str) -> str:
    i = x.find(what)
    return '' if i < 0 else x[i+len(what):]

def strafterlast(x: str, what: str) -> str:
    i = x.rfind(what)
    return '' if i < 0 else x[i+len(what):]

def strbefore(x: str, what: str) -> str:
    i = x.find(what)
    return x if i < 0 else x[:i]

def strbeforelast(x: str, what: str) -> str:
    i = x.rfind(what)
    return x if i < 0 else x[:i]

def strsince(x: str, what: str) -> str:
    i = x.find(what)
    return '' if i < 0 else x[i:]

def strsincelast(x: str, what: str) -> str:
    i = x.rfind(what)
    return '' if i < 0 else x[i:]

def struntil(x: str, what: str) -> str:
    i = x.find(what)
    return x if i < 0 else x[:i+len(what)]

def struntillast(x: str, what: str) -> str:
    i = x.rfind(what)
    return x if i < 0 else x[:i+len(what)]

struntilfinal = struntillast

def since(x: Union[str, Iterable], what: Any) -> Union[str, Iterable]:
    'Start strings/sequences with a substring/value\'s appearance.'
    return (strsince if isinstance(x, str) else itemssince)(x, what)

def sincelast(x: Union[str, Iterable], what: Any) -> Union[str, Iterable]:
    'Start strings/sequences with a substring/value\'s last appearance.'
    return (strsincelast if isinstance(x, str) else itemssincelast)(x, what)

sincefinal = sincelast

def sortk(x: Dict, key: Callable = identity, reverse: bool = False) -> Dict:
    keys = sorted(x.keys(), key=key, reverse=reverse)
    return {k: x[k] for k in keys}

sortkeys = sortk
sortedkeys = sortk

def sortkv(
    src: Dict, key: Optional[Callable] = None, reverse: bool = False,
) -> Dict:
    # by default, sort key-value pairs by their values
    if key is None:
        key = lambda x: x[1]
    kv = sorted(src.items(), key=key, reverse=reverse)
    return {k: v for (k, v) in kv}

sortedkv = sortkv

def squeeze(s: str) -> str:
    '''
    A more aggressive way to rid strings of extra spaces which, unlike string
    method strip, also turns inner runs of multiple spaces into single ones.
    '''
    s = s.strip()
    s = spaces_re.sub(' ', s)
    s = paddable_tab_re.sub('\t', s)
    return s

squeezed = squeeze

def stround(x: Union[int, float], decimals: int = 6) -> str:
    'Format numbers into a string with the given decimal-digit count.'

    if decimals >= 0:
        return f'{x:.{decimals}f}'
    else:
        return f'{round(x, decimals):.0f}'

def tally(src: Iterable, by: Callable = identity) -> Dict[Any, int]:
    '''
    Count all distinct (transformed) values, the result being a dictionary
    whose keys are all the transformed values, and whose values are positive
    integers.
    '''

    if callable(src):
        src, by = by, src

    counts: Dict[Any, int] = {}
    by = loopify(by)

    # dictionaries are counted by key-value pairs, anything else by
    # index-value pairs
    kv = src.items() if isinstance(src, dict) else enumerate(src)
    for k, v in kv:
        dk = by(k, v)
        counts[dk] = counts.get(dk, 0) + 1
    return counts

tallied = tally

def transpose(src: Any) -> Any:
    'Turn lists/objects inside-out like socks, so to speak.'

    if isinstance(src, dict):
        return { v: k for k, v in src.items() }

    if not arrayish(src):
        msg = 'transpose only supports objects or iterables of objects'
        raise ValueError(msg)

    kv: Dict[Any, Any] = {}
    seq: List[Any] = []

    for e in src:
        if isinstance(e, dict):
            for k, v in e.items():
                if k in kv:
                    kv[k].append(v)
                else:
                    kv[k] = [v]
        elif isinstance(e, Iterable):
            for i, v in enumerate(e):
                if i < len(seq):
                    seq[i].append(v)
                else:
                    seq.append([v])
        else:
            msg = 'transpose(...): not all items are iterables/objects'
            raise ValueError(msg)

    if len(kv) > 0 and len(seq) > 0:
        msg = 'transpose(...): mix of iterables and objects not supported'
        raise ValueError(msg)
    return kv if len(seq) == 0 else seq

tr = transpose
transp = transpose
transposed = transpose

def trap(x: Callable, y: Union[Callable[[Exception], Any], Any] = None) -> Any:
    'Try running a func, handing exceptions over to a fallback func.'
    try:
        return x() if callable(x) else x
    except Exception as e:
        if callable(y):
            nargs = required_arg_count(y)
            return y(e) if nargs == 1 else y()
        else:
            return y

catch = trap
catched = trap
caught = trap
noerr = trap
noerror = trap
noerrors = trap
safe = trap
save = trap
saved = trap
trapped = trap

def tsv(x: str, fn: Union[Callable, None] = None) -> Any:
    if fn is None:
        return x.split('\t')
    if callable(x):
        x, fn = fn, x
    return fn(x.split('\t'))

def typename(x: Any) -> str:
    if x is None:
        return 'null'
    if isinstance(x, bool):
        return 'boolean'
    if isinstance(x, str):
        return 'string'
    if isinstance(x, (int, float)):
        return 'number'
    if isinstance(x, (list, tuple)):
        return 'array'
    if isinstance(x, dict):
        return 'object'
    return type(x).__name__

def typeof(x: Any) -> str:
    'Get a value\'s JS-like typeof type-string.'

    if callable(x):
        return 'function'

    return {
        bool: 'boolean',
        int: 'number',
        float: 'number',
        str: 'string',
    }.get(type(x), 'object')

def unixify(s: str) -> str:
    '''
    Make plain-text `unix-style`, ignoring a leading UTF-8 BOM if present,
    and turning any/all CRLF byte-pairs into line-feed bytes.
    '''
    # a decoded UTF-8 BOM is the single code point U+FEFF; its 3 raw bytes
    # are also handled, in case the text was decoded byte-by-byte
    if s.startswith('\ufeff'):
        s = s[1:]
    elif s.startswith('\xef\xbb\xbf'):
        s = s[3:]
    return s.replace('\r\n', '\n')

def unquoted(s: str) -> str:
    'Ignore surrounding quotes in a string.'
    if s.startswith('"') and s.endswith('"'):
        return s[1:-1]
    if s.startswith('\'') and s.endswith('\''):
        return s[1:-1]
    if s.startswith('`') and s.endswith('`'):
        return s[1:-1]
    if s.startswith('”') and s.endswith('“'):
        return s[1:-1]
    if s.startswith('“') and s.endswith('”'):
        return s[1:-1]
    return s

dequote = unquoted
dequoted = unquoted

def until(x: Union[str, Iterable], what: Any) -> Union[str, Iterable]:
    'End strings/sequences with a substring/value\'s appearance.'
    return (struntil if isinstance(x, str) else itemsuntil)(x, what)

def untillast(x: Union[str, Iterable], what: Any) -> Union[str, Iterable]:
    'End strings/sequences with a substring/value\'s last appearance.'
    return (struntillast if isinstance(x, str) else itemsuntillast)(x, what)

untilfinal = untillast


def wait(seconds: Union[int, float], result: Any) -> Any:
    'Wait the given number of seconds, before returning its latter arg.'

    t = (int, float)
    if (not isinstance(seconds, t)) and isinstance(result, t):
        seconds, result = result, seconds
    sleep(seconds)
    return result

delay = wait

def wat(*args) -> Any:
    'What Are These (wat) shows help/doc messages for funcs given to it.'

    from pydoc import doc

    c = 0
    w = stderr

    for e in args:
        if not callable(e):
            continue

        if c > 0:
            print(file=w)

        print(f'\x1b[48;5;253m\x1b[38;5;26m{e.__name__:80}\x1b[0m', file=w)
        doc(e, output=w)
        c += 1

    # a Skip value keeps the result from being emitted as an output line
    return Skip()

def wit(*args) -> Any:
    'What Is This (wit) shows help/doc messages for funcs given to it.'
    return wat(*args)

def zoom(x: Any, *keys_indices) -> Any:
    for k in keys_indices:
        # allow int-indexing dicts the same way lists/tuples can be
        if isinstance(x, dict) and isinstance(k, int):
            l = len(x)
            # negative indices count backward from the end
            if k < 0:
                k += l
            if k < 0 or k >= l:
                x = None
                continue
            for i, e in enumerate(x.values()):
                if i == k:
                    x = e
                    break
            continue

        # regular key/index access for dicts/lists/tuples
        x = x[k]

    return x


# args is the `proper` list of arguments given to the script
args = argv[1:]
run_mode = ''
trace_exceptions = False
profile_run = False

if len(args) == 0:
    # show help message when given no arguments
    print(info.strip(), file=stderr)
    exit(0)

trace_opts = (
    '-t', '--t', '-trace', '--trace', '-traceback', '--traceback',
)
profile_opts = ('-p', '--p', '-prof', '--prof', '-profile', '--profile')

# handle all other leading options; the explicit help options are
# handled earlier in the script
while len(args) > 0:
    if args[0] in trace_opts:
        trace_exceptions = True
        args = args[1:]
        continue

    if args[0] in profile_opts:
        profile_run = True
        args = args[1:]
        continue

    s = opts2modes.get(args[0], '')
    if not s:
        break

    run_mode = s
    args = args[1:]

inputs = []
expression = ''
if len(args) > 0:
    expression = args[0]
    inputs = args[1:]

if not run_mode:
    run_mode = 'each-line'

if not expression and run_mode not in ('json-lines', 'each-line'):
    # show help message when given no expression
    print(info.strip(), file=stderr)
    exit(0)

glo = globals()
for e in (physics, symbols, units):
    for k, v in e.__dict__.items():
        if k not in glo:
            glo[k] = v

exec = disabled_exec

try:
    # compile the expression to
    # speed it up, since it's (re)run
    # for each line from standard input; also, handle a single-dot as
    # an identity expression, using the current line as is
    if expression in ('', '.'):
        expression = {
            'all-lines': 'lines',
            'all-bytes': 'data',
            'each-block': 'block',
            'each-line': 'line',
            'json-lines': 'data',
            'no-input': 'info.strip()',
            'whole-strings': 'value',
        }[run_mode]
    expression = compile_py(expression, expression, 'eval')

    # `comprehension` expressions seem to ignore local variables: even
    # lambda-based workarounds fail
    i = 0
    c = 1
    nr = 1
    _ = None

    fn = {
        'each-line': stop_normal,
        'each-block': stop_normal,
        'all-lines': stop_normal,
        'all-bytes': stop_normal,
        'json-lines': stop_json,
        'no-input': stop_normal,
        'whole-strings': stop_normal,
    }[run_mode]
    glo['halt'] = fn
    glo['stop'] = fn

    fn = {
        'each-line': main_each_line,
        'each-block': main_each_block,
        'all-lines': main_all_lines,
        'all-bytes': main_all_bytes,
        'json-lines': main_json_lines,
        'no-input': main_no_input,
        'whole-strings': main_whole_strings,
    }[run_mode]

    if fn is None:
        raise Exception(f'internal error: invalid run-mode {run_mode}')

    if profile_run:
        from cProfile import Profile
        # using a profiler in a `with` context adds many irrelevant
        # entries to its output
        prof = Profile()
        prof.enable()
        fn(stdout, stdin, expression, inputs)
        prof.disable()
        prof.print_stats()
    else:
        fn(stdout, stdin, expression, inputs)
except BrokenPipeError:
    # quit quietly, instead of showing a confusing error message
    stderr.close()
except KeyboardInterrupt:
    exit(2)
except Exception as e:
    if trace_exceptions:
        # re-raise as is, which keeps the original traceback
        raise
    s = str(e)
    s = s if s else '<generic exception>'
    print(f'\x1b[31m{s}\x1b[0m',
        file=stderr)
    exit(1)