How to use the sidekit.FeaturesExtractor function in SIDEKIT

The examples below show how sidekit.FeaturesExtractor is called in public projects.


GitHub: Anwarvic / Speaker-Recognition / extract_features.py
        # Feature extraction
        # lower_frequency: lower frequency (in Hertz) of the filter bank
        # higher_frequency: higher frequency (in Hertz) of the filter bank
        # filter_bank: type of filter scale to use, can be "lin" or "log"
        #              (for linear or log scale)
        # filter_bank_size: number of filters in the filter bank
        # window_size: size of the sliding window to process (in seconds)
        # shift: time shift of the sliding window (in seconds)
        # ceps_number: number of cepstral coefficients to extract
        # snr: signal-to-noise ratio used by the "snr" VAD algorithm
        # vad: type of voice activity detection algorithm to use;
        #      can be "energy", "snr", "percentil" or "lbl"
        # save_param: list of strings that indicate which parameters to save
        # keep_all_features: boolean; if True, all frames are written; if False,
        #                    only the frames selected by the VAD labels are kept
        extractor = sidekit.FeaturesExtractor(
            audio_filename_structure=os.path.join(self.audio_dir, group, "{}"),
            feature_filename_structure=os.path.join(feat_dir, "{}.h5"),
            lower_frequency=self.LOWER_FREQUENCY,
            higher_frequency=self.HIGHER_FREQUENCY,
            filter_bank=self.FILTER_BANK,
            filter_bank_size=self.FILTER_BANK_SIZE,
            window_size=self.WINDOW_SIZE,
            shift=self.WINDOW_SHIFT,
            ceps_number=self.CEPS_NUMBER,
            vad=self.VAD,
            snr=self.SNR_RATIO,
            save_param=self.FEAUTRES,
            keep_all_features=True)

        # Prepare file lists
        # show_list: list of IDs of the show to process
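The two `*_filename_structure` arguments above are templates containing a `{}` placeholder, and the show list supplies the IDs that fill it. The sketch below illustrates that substitution with plain `str.format`, assuming each show ID is inserted into both templates unchanged. The helper `resolve_paths`, the directory names, and the show IDs are hypothetical and are not part of SIDEKIT.

```python
import os


def resolve_paths(show_list, audio_structure, feature_structure):
    """Fill the "{}" placeholder of both templates with each show ID.

    Returns a list of (audio_path, feature_path) tuples, one per show.
    """
    return [
        (audio_structure.format(show), feature_structure.format(show))
        for show in show_list
    ]


if __name__ == "__main__":
    # Hypothetical templates and show IDs, for illustration only.
    audio_structure = os.path.join("audio", "enroll", "{}.wav")
    feature_structure = os.path.join("feat", "{}.h5")
    show_list = ["spk01_utt1", "spk02_utt1"]

    for audio_path, feature_path in resolve_paths(
        show_list, audio_structure, feature_structure
    ):
        print(audio_path, "->", feature_path)
```

Checking the resolved paths this way before running an extraction can catch a template that doubles the file extension or points to the wrong directory.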
GitHub: stdm / ZHAW_deep_voice / common / spectrogram / Ivec_feature_extractor.py
def extract_speaker_data(self):

        speaker_names = []
        file_names = []
        speaker_ids = []
        curr_speaker_num = -1
        old_speaker = ''

        fe = sidekit.FeaturesExtractor(audio_filename_structure="{}.wav",
                               feature_filename_structure="common/data/training/i_vector/"+self.speaker_list+"/feat/{}.h5",
                               sampling_frequency=None,
                               lower_frequency=200,
                               higher_frequency=3800,
                               filter_bank="log",
                               filter_bank_size=24,
                               window_size=0.025,
                               shift=0.01,
                               ceps_number=20,
                               vad="snr",
                               snr=40,
                               pre_emphasis=0.97,
                               save_param=["vad", "energy", "cep"],
                               keep_all_features=True)

        # Crawl the base and all sub folders
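Because `window_size` and `shift` are given in seconds, it can help to see what the values used above (0.025 and 0.01) mean in samples. The following is a minimal sketch of standard sliding-window arithmetic, assuming a 16 kHz sampling rate and no padding of the final partial window. The function `frame_layout` is hypothetical, and SIDEKIT's internal framing may count frames differently.

```python
def frame_layout(duration_s, sampling_frequency, window_size=0.025, shift=0.01):
    """Convert window parameters from seconds to samples and count frames.

    Assumes no padding: only windows that fit entirely in the signal count.
    Returns (window_samples, shift_samples, n_frames).
    """
    window_samples = int(round(window_size * sampling_frequency))
    shift_samples = int(round(shift * sampling_frequency))
    total_samples = int(round(duration_s * sampling_frequency))

    if total_samples < window_samples:
        n_frames = 0
    else:
        n_frames = 1 + (total_samples - window_samples) // shift_samples
    return window_samples, shift_samples, n_frames


if __name__ == "__main__":
    # Assumed sampling rate of 16 kHz and a one-second signal.
    window, hop, frames = frame_layout(1.0, 16000)
    print(window, hop, frames)  # 400 160 98
```

Under these assumptions, one second of audio yields 98 frames, so with `ceps_number=20` the cepstral features for that second would form roughly a 98 by 20 matrix before any energy or delta terms are appended.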