How to use the datasketch.hashfunc.sha1_hash32 function in datasketch

To help you get started, we’ve selected a few datasketch examples, based on popular ways it is used in public projects.

Secure your code as it's written. Use Snyk Code to scan source code in minutes - no build needed - and fix issues immediately.

github ekzhu / datasketch / datasketch / hyperloglog.py View on Github external
    def __init__(self, p=8, reg=None, hashfunc=sha1_hash32, hashobj=None):
        if reg is None:
            self.p = p
            self.m = 1 << p
            self.reg = np.zeros((self.m,), dtype=np.int8)
        else:
            # Check if the register has the correct type
            if not isinstance(reg, np.ndarray):
                raise ValueError("The imported register must be a numpy.ndarray.")
            # We have to check if the imported register has the correct length.
            self.m = reg.size
            self.p = _bit_length(self.m) - 1
            if 1 << self.p != self.m:
                raise ValueError("The imported register has \
                    incorrect size. Expect a power of 2.")
            # Generally we trust the user to import register that contains
            # reasonable counter values, so we don't check for every values.
github ekzhu / datasketch / datasketch / minhash.py View on Github external
def __init__(self, num_perm=128, seed=1,
            hashfunc=sha1_hash32,
            hashobj=None, # Deprecated.
            hashvalues=None, permutations=None):
        if hashvalues is not None:
            num_perm = len(hashvalues)
        if num_perm > _hash_range:
            # Because 1) we don't want the size to be too large, and
            # 2) we are using 4 bytes to store the size value
            raise ValueError("Cannot have more than %d number of\
                    permutation functions" % _hash_range)
        self.seed = seed
        # Check the hash function.
        if not callable(hashfunc):
            raise ValueError("The hashfunc must be a callable.")
        self.hashfunc = hashfunc
        # Check for use of hashobj and issue warning.
        if hashobj is not None: