How to use the reynir.bintokenizer.describe_token function in reynir

To help you get started, we’ve selected a few reynir examples, based on popular ways it is used in public projects.

GitHub: mideind / Greynir / treeutil.py
t.corr contains explanatory text if a correction has been applied

            This function has the side effect of filling in the words dictionary
            with (stem, cat) keys and occurrence counts.

        """

        # Map tokens to associated terminals, if any
        # tmap is an empty dict if there's no parse tree
        tmap = TreeUtility._terminal_map(tree)
        dump = []
        for ix, token in enumerate(tokens):
            # We have already cut away paragraph and sentence markers
            # (P_BEGIN/P_END/S_BEGIN/S_END)
            terminal, meaning = tmap.get(ix, (None, None))
            d = describe_token(ix, token, terminal, meaning)
            if words is not None:
                wt = TreeUtility._word_tuple(token, terminal, meaning)
                if wt is not None:
                    # Add the (stem, cat) combination to the words dictionary
                    words[wt] += 1
            if ix == error_index:
                # Mark the error token, if present
                d["err"] = 1
            if meaning is not None and "x" in d:
                # Also return the augmented terminal name
                d["a"] = augment_terminal(
                    terminal.name, d["x"].lower(), meaning.beyging
                )
            dump.append(d)
        return dump
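
The loop above can be approximated in a self-contained form to show its shape: look up each token's (terminal, meaning) pair by index, build a descriptor dict, count occurrences in a caller-supplied mapping, and flag the error token. This is only a sketch under stated assumptions. `describe_token_stub`, `dump_tokens`, the `"t"` key, and the sample data are hypothetical stand-ins and not part of reynir, and `words` is assumed to behave like a `defaultdict(int)`, which is what `words[wt] += 1` in the excerpt suggests.

```python
from collections import defaultdict


def describe_token_stub(ix, token):
    """Hypothetical stand-in for describe_token; returns a small descriptor dict."""
    return {"ix": ix, "x": token}


def dump_tokens(tokens, tmap, words=None, error_index=None):
    """Describe each token and, as a side effect, count meanings in `words`."""
    dump = []
    for ix, token in enumerate(tokens):
        # Tokens without a parse-tree entry get (None, None)
        terminal, meaning = tmap.get(ix, (None, None))
        d = describe_token_stub(ix, token)
        if words is not None and meaning is not None:
            # Here the meaning is itself a (stem, cat) tuple (an assumption)
            words[meaning] += 1
        if ix == error_index:
            # Mark the error token, if present
            d["err"] = 1
        if terminal is not None:
            # Hypothetical key holding the matched terminal name
            d["t"] = terminal
        dump.append(d)
    return dump


# Made-up sample data: only the first two tokens matched a terminal
tokens = ["Hún", "las", "bókina"]
tmap = {0: ("pfn", ("hún", "pfn")), 1: ("so", ("lesa", "so"))}
words = defaultdict(int)
result = dump_tokens(tokens, tmap, words=words, error_index=2)
```

Passing `words=None` skips the counting side effect entirely, mirroring the `if words is not None` guard in the original excerpt.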