How to use the fastparquet.api.filter_out_stats function in fastparquet

To help you get started, we’ve selected a few fastparquet examples, based on popular ways it is used in public projects.
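
Before the excerpt, a minimal sketch of the idea behind statistics-based row-group filtering may help. It is not fastparquet's implementation: `ROW_GROUPS`, `can_skip`, and `select_row_groups` are hypothetical names, the per-column `(min, max)` dictionaries stand in for real Parquet row-group metadata, and the filters are assumed to be `(column, operator, value)` tuples.

```python
# Hypothetical stand-in for per-row-group column statistics; real Parquet
# metadata is richer than this plain dict of (min, max) pairs.
ROW_GROUPS = [
    {"x": (0, 9)},
    {"x": (10, 19)},
    {"x": (20, 29)},
]


def can_skip(stats, filters):
    """Return True if the min/max stats prove that no row can pass every filter."""
    for column, op, value in filters:
        if column not in stats:
            continue  # no statistics for this column: cannot rule the group out
        lo, hi = stats[column]
        if op == "==" and (value < lo or value > hi):
            return True
        if op == ">" and hi <= value:
            return True
        if op == ">=" and hi < value:
            return True
        if op == "<" and lo >= value:
            return True
        if op == "<=" and lo > value:
            return True
    return False


def select_row_groups(row_groups, filters):
    """Keep only the row groups that the statistics cannot exclude."""
    return [rg for rg in row_groups if not can_skip(rg, filters)]


kept = select_row_groups(ROW_GROUPS, [("x", ">", 12)])
```

The list comprehension in `select_row_groups` has the same shape as the `rgs = [...]` expression in the excerpt below: a row group is read only if the filter check does not exclude it.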


Source: dask/dask on GitHub, dask/dataframe/io/parquet.py
"do not contain a global '_metadata' file"
        )

    (
        meta,
        filters,
        index_name,
        out_type,
        all_columns,
        index_names,
        storage_name_mapping,
    ) = _pf_validation(pf, columns, index, categories, filters)
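    # Keep only the row groups that the filters cannot exclude, judging by
    # column statistics (filter_out_stats) or categories (filter_out_cats).
    rgs = [
        rg
        for rg in pf.row_groups
        if not (fastparquet.api.filter_out_stats(rg, filters, pf.schema))
        and not (fastparquet.api.filter_out_cats(rg, filters))
    ]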

    name = "read-parquet-" + tokenize(fs_token, paths, all_columns, filters, categories)
    dsk = {
        (name, i): (
            _read_parquet_row_group,
            fs,
            pf.row_group_filename(rg),
            index_names,
            all_columns,
            rg,
            out_type == Series,
            categories,
            pf.schema,
            pf.cats,