How to use the pdfplumber.utils.extract_text function in pdfplumber

To help you get started, we’ve selected a few pdfplumber examples, based on popular ways it is used in public projects.

Secure your code as it's written. Use Snyk Code to scan source code in minutes - no build needed - and fix issues immediately.

github jsvine / pdfplumber / pdfplumber / page.py View on Github external
def extract_text(self,
        x_tolerance=utils.DEFAULT_X_TOLERANCE,
        y_tolerance=utils.DEFAULT_Y_TOLERANCE):

        return utils.extract_text(self.chars,
            x_tolerance=x_tolerance,
            y_tolerance=y_tolerance)
github jsvine / pdfplumber / pdfplumber / table.py View on Github external
)

        for row in self.rows:
            arr = []
            row_chars = [ char for char in chars
                if char_in_bbox(char, row.bbox) ]

            for cell in row.cells:
                if cell == None:
                    cell_text = None
                else:
                    cell_chars = [ char for char in row_chars
                        if char_in_bbox(char, cell) ]

                    if len(cell_chars):
                        cell_text = utils.extract_text(cell_chars,
                            x_tolerance=x_tolerance,
                            y_tolerance=y_tolerance).strip()
                    else:
                        cell_text = ""
                arr.append(cell_text)
            table_arr.append(arr)

        return table_arr