Module paroxython.flatten_ast
Return a non-recursive, multiline dump of the AST (with some tweaks).
Motivation and purpose
Static analysis tools like Paroxython generally start with the AST (abstract syntax tree) of the program to analyze. For instance, when the string:
'print("hello, world")'
… is fed to the function parse()
of the module ast
,
the following object is returned:
Module()
|
| body[0]
|
Expr()
|
| value
|
Call()
/ \
func / \ args[0]
/ \
Name(id='print', ctx=Load()) Str(s='hello, world', kind='')
The purpose of our function flatten_ast()
is to transform this nested structure into a curated
pre-ordered sequence of text lines, each of which consisting in a key-value pair. A key is the
complete path from the root of the AST to either a node (e.g., Name()
) or a node attribute
(e.g., id
). The associated value comes after an equals sign (=
):
/_type=Module
/body/_length=1
/body/1/_type=Expr
/body/1/_pos=1:1-
/body/1/value/_type=Call
/body/1/value/_hash=0x0001
/body/1/value/_pos=1:1-0-
/body/1/value/func/_type=Name
/body/1/value/func/_hash=0x0002
/body/1/value/func/_pos=1:1-0-0-
/body/1/value/func/id=print
/body/1/value/func/ctx/_type=Load
/body/1/value/args/_length=1
/body/1/value/args/1/_type=Str
/body/1/value/args/1/_hash=0x0003
/body/1/value/args/1/_pos=1:1-0-1-1-
/body/1/value/args/1/s=hello, world
/body/1/value/keywords/_length=0
/type_ignores/_length=0
Obviously, such a representation is quite verbose. But every single line encapsulates most of the
nesting information we need to understand its context. Since the recursion is linearly encoded,
requesting it does not require recursion any more. Thus,
regular expressions become natural candidates
for the request language1. The current version of Paroxython explores this option.
The regular expression patterns are processed in paroxython.parse_program
on the basis of the
definitions listed in spec.md
.
For instance, the following regular expression matches and localizes the nested function calls
(e.g., foo(bar(42))
) without any recursion:
^(.*) # Capture the path originating from the root...
/_type=Call # ... to the nesting call.
\n(?:\1.+\n)*?\1 # Skip any number of attributes of this call...
/_pos=(?P<POS>.+) # ... until we find and capture its line number.
\n(?:\1.+\n)* \1 # Skip any number of...
/args/.* # ... arguments of this call...
/_type=Call # ... until we find a nested call.
When the first call is found, continuing in the subtree is a simple matter of restricting the search
to the block of lines that share the same prefix (referred by \1
).
That's the general principle. In the next section, we enumerate the tweaks we make to the generation of this text to make it more suitable for regular expression parsing.
Implementation details
On the fly: modification of the AST
Field reordering
Somewhat counterintuitively, in the original AST, several pieces of information pertaining to a
function signature or decoration are rejected after the body of the function. As a result, we
lose the interesting property that the line numbers cannot decrease. We fix this by making body
the last field of a function or class definition
(see issue #4).
Children numbering
A counter is inserted between each node and its children. Starting from 1 enables us to find the last one by matching its number with the (previously captured) length of the list.
Field disambiguation
Accross the AST, the same fields are sometimes used in different contexts. To make it easier or even possible to detect certain features, their usage is disambiguated by applying the following renaming rules:
orelse
: used for both loop and conditionalelse
branches. Replace the former withloopelse
.target
: used for assignment, comprehension, etc. Replace the former withassigntarget
.targets
: used for augmented assignment, deletion, etc. Replace the former withassigntargets
.value
: used for assignment (augmented or not), comprehension, assignment-expressions, etc. Replace the former ones withassignvalue
.
Thus, in the output, there is no more need to search for the context of a field in the previous or following lines.
Added fields
_type
: stores the name of the node (e.g.,Module
,Expr
,Call
)._pos
: stores the line number (extracted from the attributelineno
of the node, if any2), followed by a colon:
, followed by the path from the root of the AST to this node. A path is coded as a sequence of positive integers, hyphen-separated and ended by a hyphen. Its fundamental property is: node n_2 is nested within node n_1 iff the path of n_1 is a prefix of the path of n_2._hash
: see next section.
Expression decontextualization and hashing
In order to detect certain features, such as
swap
:
(a[i], a[j]) = (a[j], a[i])
… we need to recognize a given expression accross its various occurrences. For this, we tweak the
output of the library function ast.dump()
. For instance, in the following statement:
a[i] = 42
… the expression a[i]
is dumped as the string:
"Subscript(value=Name(id='a', ctx=Load()), slice=Index(value=Name(id='i', ctx=Load())), ctx=Store())"
However, in:
return a[i]
… the same expression a[i]
is dumped as the string:
"Subscript(value=Name(id='a', ctx=Load()), slice=Index(value=Name(id='i', ctx=Load())), ctx=Load())"
As you can see, these two strings are not equal, due to the fact that the dump specifies not only
which expressions are used, but how they are used: this is the purpose of the field ctx
. Once
this context is removed, both strings become:
"Subscript(value=Name(id='a'), slice=Index(value=Name(id='i')))"
They can now be represented by the same hash value3, stored in the added field _hash
.
After the fact: modification of the flattened AST
Negative literal simplification
Normally, the AST structure of a numeric literal depends on its sign. For instance, 42 is parsed as:
Expr(value=Num(n=42))
… while -42 is parsed as:
Expr(value=UnaryOp(op=USub(), operand=Num(n=42)))
To unify the further treatment of numbers, the function simplify_negative_literals()
post-processes
the generated flat representation to transform this kind of output:
/body/1/_type=Expr
/body/1/_pos=1:1-
/body/1/value/_type=UnaryOp
/body/1/value/_hash=0x0001
/body/1/value/_pos=1:1-0-
/body/1/value/op/_type=USub
/body/1/value/operand/_type=Num
/body/1/value/operand/_hash=0x0002
/body/1/value/operand/_pos=1:1-0-1-
/body/1/value/operand/n=42
into this one:
/body/1/_type=Expr
/body/1/_pos=1:1-
/body/1/value/_type=Num
/body/1/value/_hash=0x0001
/body/1/value/_pos=1:1-0-
/body/1/value/n=-42
Index simplification
Since Python 3.9, the class ast.Index
is deprecated: simple indexes are represented by their value.
For instance, a[42]
is parsed as:
[...] slice=Constant(value=42) [...]
… instead of:
[...] slice=Index(value=Constant(value=42))) [...]
To unify the further treatment of indexes, the function simplify_indexes()
post-processes the
generated flat representation to transform this kind of output:
/body/1/_type=Expr
/body/1/_pos=1:1-
/body/1/value/_type=Subscript
/body/1/value/_hash=0x0001
/body/1/value/_pos=1:1-0-
/body/1/value/value/_type=Name
/body/1/value/value/_hash=0x0002
/body/1/value/value/_pos=1:1-0-0-
/body/1/value/value/id=a
/body/1/value/value/ctx/_type=Load
/body/1/value/slice/_type=Index # Line to be suppressed.
/body/1/value/slice/value/_type=Num # Lines to be simplified...
/body/1/value/slice/value/_hash=0x0003 # ...
/body/1/value/slice/value/_pos=1:1-0-1-0- # ...
/body/1/value/slice/value/n=42 # ...
/body/1/value/ctx/_type=Load
into this one:
/body/1/_type=Expr
/body/1/_pos=1:1-
/body/1/value/_type=Subscript
/body/1/value/_hash=0x0001
/body/1/value/_pos=1:1-0-
/body/1/value/value/_type=Name
/body/1/value/value/_hash=0x0002
/body/1/value/value/_pos=1:1-0-0-
/body/1/value/value/id=a
/body/1/value/value/ctx/_type=Load
/body/1/value/slice/_type=Num # Simplified lines...
/body/1/value/slice/_hash=0x0003 # ...
/body/1/value/slice/_pos=1:1-0-1- # ... (the last segment of the path is dropped)
/body/1/value/slice/n=42 # ...
/body/1/value/ctx/_type=Load
Suppressing useless lines
- Starting with Python 3.8,
kind
is an attribute of a constant allowing tools to distinguish certain flavors of the strings . Its possible values are"rf"
,"f"
,"u"
and""
in the third-party librarytyped_ast
, but"u"
andNone
in the Python 3.8 standard libraryast
. Since such details are of no interest to us, we don't bother making them consistent. The functionsuppress_kinds()
deletes from the flat AST any line ending withkind=.*
. - Python 3.8 introduces positional-only parameters
to respond to “certain challenges for library authors and users of APIs”. This is far beyond our
pedagogical scope: we are afraid we have no choice but to sell these lines for scientific
experiments (
suppress_posonlyargs()
).
Reverting the representation of constants to the good ol' days
Probably for excellent reasons, Python 3.8 has decided that Num(n)
, Str(s)
, Bytes(s)
,
Ellipsis
and NameConstant(value)
should all be unified under Constant(value)
. As far as we
are concerned, this change makes the type of the constants much harder to determine. Until we
know more, we are restoring the previous representations with backport_all_constants()
.
Unquoting strings
unquote()
suppress the single or double quotes delimiters after =
. This makes for simpler regular
expressions in spec.md
.
Correcting the span of decorated callable
correct_all_decorated_callable_starts()
fixes a bug of typed_ast
in which decorated functions
and classes would incorrectly be marked as starting from the line of their first decorator.
-
Although nowadays, numerous flavours of regex engines support recursion, the complexity of writing, reading and maintaining any pattern using it would quickly go through the roof. ↩
-
Numerous nodes, e.g.
Module
,Store
,Load
,Lt
,Add
, are not associated with a line number. ↩ -
Actually, to avoid dealing with hash randomization in the unit tests, we use a function which merely increments a counter and associates it with any newly encountered string (see
PseudoHashFactory
). ↩
Expand source code Browse GitHub
import ast
import sys
from typing import Any, Callable, Match
import regex # type: ignore
def select_ast_post_processing():
if sys.version_info >= (3, 10):
def post_process(flat_ast):
flat_ast = suppress_kinds(flat_ast)
flat_ast = suppress_alias_pos(flat_ast)
flat_ast = suppress_posonlyargs(flat_ast)
flat_ast = backport_all_constants(flat_ast)
flat_ast = simplify_negative_literals(flat_ast)
flat_ast = unquote(flat_ast)
return flat_ast
elif sys.version_info >= (3, 9):
def post_process(flat_ast):
flat_ast = suppress_kinds(flat_ast)
flat_ast = suppress_posonlyargs(flat_ast)
flat_ast = backport_all_constants(flat_ast)
flat_ast = simplify_negative_literals(flat_ast)
flat_ast = unquote(flat_ast)
return flat_ast
else:
def post_process(flat_ast):
flat_ast = suppress_kinds(flat_ast)
flat_ast = suppress_posonlyargs(flat_ast)
flat_ast = backport_all_constants(flat_ast)
flat_ast = simplify_negative_literals(flat_ast)
flat_ast = simplify_indexes(flat_ast)
flat_ast = unquote(flat_ast)
return flat_ast
return post_process
post_process = select_ast_post_processing()
def flatten_ast(tree) -> str:
pseudo_hash.reset()
flat_ast = flatten_node(tree)
flat_ast = post_process(flat_ast)
return flat_ast
def flatten_node(
node: Any,
prefix: str = "",
path: str = "",
remove_context: Callable = regex.compile(r", ctx=.+?\(\)").sub,
) -> str:
if isinstance(node, ast.AST):
acc = [f"{prefix}/_type={type(node).__name__}\n"]
if isinstance(node, ast.expr):
node_repr = remove_context("", ast.dump(node))
acc.append(f"{prefix}/_hash={pseudo_hash(node_repr)}\n")
if "lineno" in node._attributes:
acc.append(f"{prefix}/_pos={node.lineno}:{path[2:]}\n")
fields = ast.iter_fields(node)
if isinstance(node, (ast.FunctionDef, ast.ClassDef)):
# make `body` the last item of a `def` clause
fields = iter(sorted(fields, key=lambda c: c[0] == "body"))
for (i, (name, x)) in enumerate(fields):
if name == "orelse" and isinstance(node, (ast.For, ast.While, ast.AsyncFor)):
name = "loopelse" # make the `orelse` clause specific to conditionals
elif name == "targets" and isinstance(node, ast.Assign):
name = "assigntargets" # `targets` is also used for `Delete`, etc.
elif name == "target" and isinstance(node, ast.AugAssign):
name = "assigntarget" # `target` is also used for comprehension, etc.
elif name == "value" and isinstance(node, (ast.Assign, ast.AugAssign)):
name = "assignvalue" # `value` is also used for comprehension, etc.
acc.append(flatten_node(x, f"{prefix}/{name}", f"{path}{i}-"))
return "".join(acc)
elif isinstance(node, list):
acc = [f"{prefix}/_length={len(node)}\n"]
for (i, x) in enumerate(node, 1): # number the children from 1
acc.append(flatten_node(x, f"{prefix}/{i}", f"{path}{i}-"))
return "".join(acc)
else:
# If the node is a terminal value, dump it
return f"""{prefix}={repr(node)}\n"""
def simplify_negative_literals(
flat_ast: str,
sub: Callable = regex.compile(
).sub,
) -> str:
return sub(r"\1/_type=Num\2\1/n=-\3", flat_ast)
def simplify_indexes(
flat_ast: str,
subn_index: Callable = regex.compile(r"(?m)^(.*?)/_type=Index\n").subn,
sub_slice_value_pos: Callable = regex.compile(r"(?m)^([^=]*/slice/value/_pos=.+)\d+-").sub,
sub_slice_value: Callable = regex.compile(r"(?m)^([^=]*/slice)/value").sub,
) -> str:
(flat_ast, n) = subn_index("", flat_ast)
if n == 0:
return flat_ast
flat_ast = sub_slice_value_pos(r"\1", flat_ast)
return sub_slice_value(r"\1", flat_ast)
def backport_all_constants(
flat_ast: str,
sub: Callable = regex.compile(
).sub,
) -> str:
return sub(replace_one_constant, flat_ast)
def replace_one_constant(m: Match) -> str:
if m[3].startswith(("'", '"')):
return f"{m[1]}/_type=Str{m[2]}\n{m[1]}/s={m[3]}"
elif m[3] in ("True", "False", "None"):
return f"{m[1]}/_type=NameConstant{m[2]}\n{m[1]}/value={m[3]}"
if m[3].startswith("b'"):
return f"{m[1]}/_type=Bytes{m[2]}\n{m[1]}/s={m[3]}"
if m[3] == "Ellipsis":
return f"{m[1]}/_type=Ellipsis{m[2]}"
else:
return f"{m[1]}/_type=Num{m[2]}\n{m[1]}/n={m[3]}"
def correct_all_decorated_callable_starts(
flat_ast: str,
sub: Callable = regex.compile(
).sub,
) -> str:
return sub(replace_one_decorated_callable_start, flat_ast)
def replace_one_decorated_callable_start(m: Match) -> str:
start = int(m[5]) + int(m[6]) # number of decorators + line number of the first decorator
return f"{m[1]}/_type={m[2]}{m[3]}{start}:{m[4]}"
def unquote(
flat_ast: str,
sub: Callable = regex.compile(r"""=["'](.*)['"]\n""").sub,
) -> str:
return sub(r"=\1\n", flat_ast)
def suppress_kinds(
flat_ast: str,
sub: Callable = regex.compile(r"(?m)^.+/kind=.*\n").sub,
) -> str:
return sub("", flat_ast)
def suppress_alias_pos(
flat_ast: str,
sub: Callable = regex.compile(r"(?m)^(.+/_type=alias\n).+_pos=.+\n").sub,
) -> str:
return sub(r"\1", flat_ast)
def suppress_posonlyargs(
flat_ast: str,
sub: Callable = regex.compile(r"(?m)^.+/args/posonlyargs/_length=\d+\n").sub,
) -> str:
return sub("", flat_ast)
__pdoc__ = {"PseudoHashFactory.__call__": True}
class PseudoHashFactory:
def __init__(self):
self.reset()
def reset(self):
self.i = 0
self.cache = {}
def __call__(self, x):
if x not in self.cache:
self.i += 1
self.cache[x] = f"0x{self.i:04x}"
# 4.294.967.296 distinct expressions per program ought to be enough for anybody.
return self.cache[x]
pseudo_hash = PseudoHashFactory()
if __name__ == "__main__":
Path = __import__("pathlib").Path
source = Path("sandbox/source.py").read_text()
print(flatten_ast(ast.parse(source)))
Functions
def select_ast_post_processing()
-
Customize the post-treatment of the AST depending on the parser used to create it.
Expand source code Browse GitHub
def select_ast_post_processing(): if sys.version_info >= (3, 10): def post_process(flat_ast): flat_ast = suppress_kinds(flat_ast) flat_ast = suppress_alias_pos(flat_ast) flat_ast = suppress_posonlyargs(flat_ast) flat_ast = backport_all_constants(flat_ast) flat_ast = simplify_negative_literals(flat_ast) flat_ast = unquote(flat_ast) return flat_ast elif sys.version_info >= (3, 9): def post_process(flat_ast): flat_ast = suppress_kinds(flat_ast) flat_ast = suppress_posonlyargs(flat_ast) flat_ast = backport_all_constants(flat_ast) flat_ast = simplify_negative_literals(flat_ast) flat_ast = unquote(flat_ast) return flat_ast else: def post_process(flat_ast): flat_ast = suppress_kinds(flat_ast) flat_ast = suppress_posonlyargs(flat_ast) flat_ast = backport_all_constants(flat_ast) flat_ast = simplify_negative_literals(flat_ast) flat_ast = simplify_indexes(flat_ast) flat_ast = unquote(flat_ast) return flat_ast return post_process
def post_process(flat_ast)
-
Expand source code Browse GitHub
def post_process(flat_ast): flat_ast = suppress_kinds(flat_ast) flat_ast = suppress_posonlyargs(flat_ast) flat_ast = backport_all_constants(flat_ast) flat_ast = simplify_negative_literals(flat_ast) flat_ast = simplify_indexes(flat_ast) flat_ast = unquote(flat_ast) return flat_ast
def flatten_ast(tree) ‑> str
-
Return a non-recursive, multiline dump of the AST (with some tweaks).
A simple wrapper around the recursive function
flatten_node()
, with post-processing treatments depending on the AST library (either the standardast
module or the third-partytyped_ast
).Expand source code Browse GitHub
def flatten_ast(tree) -> str: pseudo_hash.reset() flat_ast = flatten_node(tree) flat_ast = post_process(flat_ast) return flat_ast
def flatten_node(node: Any,
prefix: str = '',
path: str = '',
remove_context: Callable = <built-in method sub of regex>
) ‑> str-
Traverse recursively (in pre-order) the given AST node and flatten its subtree.
Args
node
:Any
- The node to traverse. Initially, the whole AST.
prefix
:str
, optional- The prefix of the current line to dump. Defaults to
""
. path
:str
, optional- The path of the current node. Defaults to
""
. remove_context
:Callable
, optional- A function removing the node context encoded in the
result of
ast.dump()
. Not to be explicitly provided. Defaults toregex.compile(r", ctx=.+?\(\)").sub
.
Returns
str
- A flat representation of the given node.
Expand source code Browse GitHub
def flatten_node( node: Any, prefix: str = "", path: str = "", remove_context: Callable = regex.compile(r", ctx=.+?\(\)").sub, ) -> str: if isinstance(node, ast.AST): acc = [f"{prefix}/_type={type(node).__name__}\n"] if isinstance(node, ast.expr): node_repr = remove_context("", ast.dump(node)) acc.append(f"{prefix}/_hash={pseudo_hash(node_repr)}\n") if "lineno" in node._attributes: acc.append(f"{prefix}/_pos={node.lineno}:{path[2:]}\n") fields = ast.iter_fields(node) if isinstance(node, (ast.FunctionDef, ast.ClassDef)): # make `body` the last item of a `def` clause fields = iter(sorted(fields, key=lambda c: c[0] == "body")) for (i, (name, x)) in enumerate(fields): if name == "orelse" and isinstance(node, (ast.For, ast.While, ast.AsyncFor)): name = "loopelse" # make the `orelse` clause specific to conditionals elif name == "targets" and isinstance(node, ast.Assign): name = "assigntargets" # `targets` is also used for `Delete`, etc. elif name == "target" and isinstance(node, ast.AugAssign): name = "assigntarget" # `target` is also used for comprehension, etc. elif name == "value" and isinstance(node, (ast.Assign, ast.AugAssign)): name = "assignvalue" # `value` is also used for comprehension, etc. acc.append(flatten_node(x, f"{prefix}/{name}", f"{path}{i}-")) return "".join(acc) elif isinstance(node, list): acc = [f"{prefix}/_length={len(node)}\n"] for (i, x) in enumerate(node, 1): # number the children from 1 acc.append(flatten_node(x, f"{prefix}/{i}", f"{path}{i}-")) return "".join(acc) else: # If the node is a terminal value, dump it return f"""{prefix}={repr(node)}\n"""
def simplify_negative_literals(flat_ast: str,
sub: Callable = <built-in method sub of regex>
) ‑> str-
Simplify the negative literals of a given flat AST.
Argument
sub
not to be explicitly provided.Expand source code Browse GitHub
def simplify_negative_literals( flat_ast: str, sub: Callable = regex.compile( ).sub, ) -> str: return sub(r"\1/_type=Num\2\1/n=-\3", flat_ast)
def simplify_indexes(flat_ast: str,
subn_index: Callable = <built-in method subn of regex>,
sub_slice_value_pos: Callable = <built-in method sub of regex>,
sub_slice_value: Callable = <built-in method sub of regex>
) ‑> str-
Simplify the indexes of a given flat AST.
Argument
sub
not to be explicitly provided.Expand source code Browse GitHub
def simplify_indexes( flat_ast: str, subn_index: Callable = regex.compile(r"(?m)^(.*?)/_type=Index\n").subn, sub_slice_value_pos: Callable = regex.compile(r"(?m)^([^=]*/slice/value/_pos=.+)\d+-").sub, sub_slice_value: Callable = regex.compile(r"(?m)^([^=]*/slice)/value").sub, ) -> str: (flat_ast, n) = subn_index("", flat_ast) if n == 0: return flat_ast flat_ast = sub_slice_value_pos(r"\1", flat_ast) return sub_slice_value(r"\1", flat_ast)
def backport_all_constants(flat_ast: str,
sub: Callable = <built-in method sub of regex>
) ‑> str-
Restore the more detailed representation of constants used before Python 3.8.
Argument
sub
not to be explicitly provided.Expand source code Browse GitHub
def backport_all_constants( flat_ast: str, sub: Callable = regex.compile( ).sub, ) -> str: return sub(replace_one_constant, flat_ast)
def replace_one_constant(m: Match) ‑> str
-
Define the replacement function used in
backport_all_constants()
.Expand source code Browse GitHub
def replace_one_constant(m: Match) -> str: if m[3].startswith(("'", '"')): return f"{m[1]}/_type=Str{m[2]}\n{m[1]}/s={m[3]}" elif m[3] in ("True", "False", "None"): return f"{m[1]}/_type=NameConstant{m[2]}\n{m[1]}/value={m[3]}" if m[3].startswith("b'"): return f"{m[1]}/_type=Bytes{m[2]}\n{m[1]}/s={m[3]}" if m[3] == "Ellipsis": return f"{m[1]}/_type=Ellipsis{m[2]}" else: return f"{m[1]}/_type=Num{m[2]}\n{m[1]}/n={m[3]}"
def correct_all_decorated_callable_starts(flat_ast: str,
sub: Callable = <built-in method sub of regex>
) ‑> str-
Make the span of a decorated callable starting from
def
, not from the first decorator.Note
The bug is present in the third-party module
typed_ast
, but fixed in Python 3.8's standard libraryast
.Argument
sub
not to be explicitly provided.Expand source code Browse GitHub
def correct_all_decorated_callable_starts( flat_ast: str, sub: Callable = regex.compile( ).sub, ) -> str: return sub(replace_one_decorated_callable_start, flat_ast)
def replace_one_decorated_callable_start(m: Match) ‑> str
-
Define the replacement function used in
correct_all_decorated_callable_starts()
.Expand source code Browse GitHub
def replace_one_decorated_callable_start(m: Match) -> str: start = int(m[5]) + int(m[6]) # number of decorators + line number of the first decorator return f"{m[1]}/_type={m[2]}{m[3]}{start}:{m[4]}"
def unquote(flat_ast: str,
sub: Callable = <built-in method sub of regex>
) ‑> str-
Suppress the quote delimiters after “=“ to simplify the regular expressions of
spec.md
.Argument
sub
not to be explicitly provided.Expand source code Browse GitHub
def unquote( flat_ast: str, sub: Callable = regex.compile(r"""=["'](.*)['"]\n""").sub, ) -> str: return sub(r"=\1\n", flat_ast)
def suppress_kinds(flat_ast: str,
sub: Callable = <built-in method sub of regex>
) ‑> str-
Suppress all leaves containing
/kind=
. They are useless in the following.Argument
sub
not to be explicitly provided.Expand source code Browse GitHub
def suppress_kinds( flat_ast: str, sub: Callable = regex.compile(r"(?m)^.+/kind=.*\n").sub, ) -> str: return sub("", flat_ast)
def suppress_alias_pos(flat_ast: str,
sub: Callable = <built-in method sub of regex>
) ‑> str-
Suppress the AST line following import aliases. Since Python 3.10, it contains a position, useless in the following.
Argument
sub
not to be explicitly provided.Expand source code Browse GitHub
def suppress_alias_pos( flat_ast: str, sub: Callable = regex.compile(r"(?m)^(.+/_type=alias\n).+_pos=.+\n").sub, ) -> str: return sub(r"\1", flat_ast)
def suppress_posonlyargs(flat_ast: str,
sub: Callable = <built-in method sub of regex>
) ‑> str-
Suppress positional-only parameters in function definitions.
Argument
sub
not to be explicitly provided.Expand source code Browse GitHub
def suppress_posonlyargs( flat_ast: str, sub: Callable = regex.compile(r"(?m)^.+/args/posonlyargs/_length=\d+\n").sub, ) -> str: return sub("", flat_ast)
Classes
class PseudoHashFactory
-
A factory that creates a deterministic replacement for the
hash()
builtin function.Expand source code Browse GitHub
class PseudoHashFactory: def __init__(self): self.reset() def reset(self): self.i = 0 self.cache = {} def __call__(self, x): if x not in self.cache: self.i += 1 self.cache[x] = f"0x{self.i:04x}" # 4.294.967.296 distinct expressions per program ought to be enough for anybody. return self.cache[x]
Methods
def reset(self)
-
Restart the global counter and empties the cache.
Expand source code Browse GitHub
def reset(self): self.i = 0 self.cache = {}
def __call__(self, x)
-
Return a 4-digits hexadecimal identifier for the given immutable value.
Expand source code Browse GitHub
def __call__(self, x): if x not in self.cache: self.i += 1 self.cache[x] = f"0x{self.i:04x}" # 4.294.967.296 distinct expressions per program ought to be enough for anybody. return self.cache[x]