How to use the awswrangler.pandas.read_csv function in awswrangler

To help you get started, we’ve selected a few awswrangler examples, based on popular ways it is used in public projects.

Source: awslabs / aws-data-wrangler / awswrangler / pandas.py (GitHub)
        :param thousands: Same as pandas.read_csv()
        :param decimal: Same as pandas.read_csv()
        :param lineterminator: Same as pandas.read_csv()
        :param quotechar: Same as pandas.read_csv()
        :param quoting: Same as pandas.read_csv()
        :param escapechar: Same as pandas.read_csv()
        :param parse_dates: Same as pandas.read_csv()
        :param infer_datetime_format: Same as pandas.read_csv()
        :param encoding: Same as pandas.read_csv()
        :param converters: Same as pandas.read_csv()
        :return: Pandas Dataframe
        """
        buff = BytesIO()
        client_s3.download_fileobj(Bucket=bucket_name, Key=key_path, Fileobj=buff)
        buff.seek(0)  # rewind so pandas reads from the start of the buffer
        dataframe = pd.read_csv(
            buff,
            header=header,
            names=names,
            usecols=usecols,
            sep=sep,
            thousands=thousands,
            decimal=decimal,
            quotechar=quotechar,
            quoting=quoting,
            escapechar=escapechar,
            parse_dates=parse_dates,
            infer_datetime_format=infer_datetime_format,
            lineterminator=lineterminator,
            dtype=dtype,
            encoding=encoding,
            converters=converters,
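
The excerpt above is cut off mid-call, so here is a minimal, self-contained sketch of the same pattern: fill an in-memory buffer, rewind it, and hand it to `pd.read_csv` with the parsing options passed through. It is not the library's implementation. The helper name `read_csv_from_buffer` and the sample data are hypothetical, and `buff.write(raw)` stands in for the `client_s3.download_fileobj(...)` call so that the example runs without AWS credentials.

```python
from io import BytesIO

import pandas as pd


def read_csv_from_buffer(raw: bytes, **read_csv_kwargs) -> pd.DataFrame:
    """Parse CSV bytes held in memory, forwarding options to pandas.read_csv.

    Hypothetical helper for illustration; not part of awswrangler.
    """
    buff = BytesIO()
    # Stand-in for: client_s3.download_fileobj(Bucket=..., Key=..., Fileobj=buff)
    buff.write(raw)
    # Rewind: after writing, the position is at the end of the buffer,
    # and pandas would otherwise find nothing to read.
    buff.seek(0)
    return pd.read_csv(buff, **read_csv_kwargs)


# Sample data using ";" as separator and European-style numbers.
raw = b"id;amount\n1;1.234,5\n2;10,0\n"
df = read_csv_from_buffer(raw, sep=";", thousands=".", decimal=",")
print(df)
```

Forwarding keyword arguments unchanged mirrors how the excerpt passes `sep`, `thousands`, `decimal`, and the other options straight to pandas, which is why the docstring can describe each one as "Same as pandas.read_csv()".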
Source: awslabs / aws-data-wrangler / awswrangler / pandas.py (GitHub)
                                                        sep=sep,
                                                        quoting=quoting,
                                                        quotechar=quotechar,
                                                        lineterminator=lineterminator)
                    forgotten_bytes = len(body[last_char:])
                elif count == bounders_len:  # Last chunk
                    last_char = chunk_size
                else:
                    last_char = Pandas._find_terminator(body=body,
                                                        sep=sep,
                                                        quoting=quoting,
                                                        quotechar=quotechar,
                                                        lineterminator=lineterminator)
                    forgotten_bytes = len(body[last_char:])

                df = pd.read_csv(StringIO(body[:last_char].decode("utf-8")),
                                 header=header,
                                 names=names,
                                 usecols=usecols,
                                 sep=sep,
                                 thousands=thousands,
                                 decimal=decimal,
                                 quotechar=quotechar,
                                 quoting=quoting,
                                 escapechar=escapechar,
                                 parse_dates=parse_dates,
                                 infer_datetime_format=infer_datetime_format,
                                 lineterminator=lineterminator,
                                 dtype=dtype,
                                 encoding=encoding,
                                 converters=converters)
                yield df
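
The chunked excerpt above starts mid-function, so the following is a simplified sketch of the idea it appears to implement: cut each chunk of bytes at the last complete row, parse that part, and carry the leftover bytes into the next chunk. It rests on assumptions that are not in the original. The function name `iter_csv_chunks` is hypothetical, the data is already in memory rather than fetched from S3 by byte range, the first line is a comma-separated header, and rows never contain quoted newlines (the original delegates that case to `Pandas._find_terminator`, which is not shown here).

```python
from io import StringIO
from typing import Iterator

import pandas as pd


def iter_csv_chunks(data: bytes, chunk_size: int) -> Iterator[pd.DataFrame]:
    """Yield DataFrames from CSV bytes, at most chunk_size bytes at a time.

    Hypothetical illustration; assumes "\n" row endings and no quoted newlines.
    """
    term = b"\n"
    # Read the header once so every chunk gets the same column names.
    header_end = data.index(term) + len(term)
    names = data[:header_end].decode("utf-8").strip().split(",")

    pos = header_end
    while pos < len(data):
        body = data[pos:pos + chunk_size]
        if pos + chunk_size >= len(data):
            # Last chunk: nothing follows, so take everything that is left.
            last_char = len(body)
        else:
            # Cut just after the last complete row; the bytes after it are
            # re-read at the start of the next chunk.
            idx = body.rfind(term)
            if idx == -1:
                raise ValueError("chunk_size is smaller than a single row")
            last_char = idx + len(term)
        text = body[:last_char].decode("utf-8")
        yield pd.read_csv(StringIO(text), header=None, names=names)
        pos += last_char


data = b"a,b\n1,2\n3,4\n5,6\n"
frames = list(iter_csv_chunks(data, chunk_size=9))
combined = pd.concat(frames, ignore_index=True)
print(len(frames), combined["a"].tolist())
```

Advancing `pos` by `last_char` rather than by `chunk_size` plays the role of the `forgotten_bytes` bookkeeping in the excerpt: a row split across a chunk boundary is never parsed in halves.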