import sys

import fastcluster as fc  # assumed: `fc` is an alias for the fastcluster module


def calc_dendrograms(HC, D, linkage_type='single'):
    # Map each linkage name to the corresponding fastcluster function.
    linkage_types = {'single': fc.single,
                     'complete': fc.complete,
                     'average': fc.average,
                     'weighted': fc.weighted,
                     'centroid': fc.centroid,
                     'median': fc.median,
                     'ward': fc.ward}
    T = {}
    print("Calculating linkages. This will take a while!")
    nk = len(D)
    for i, dk in enumerate(D):
        # Overwrite the same console line with a progress counter.
        print("\r%d/%d" % (i + 1, nk), end="")
        sys.stdout.flush()
        L = linkage_types[linkage_type](D[dk])
        # construct_dendrogram is not shown in this snippet.
        T[dk] = construct_dendrogram(L, HC[dk])
    return T
    The input nodes are labeled 0, ..., N - 1, and the newly generated nodes
    have the labels N, ..., 2N - 2. The third column contains the distance
    between the two nodes at each step, i.e. the current minimal distance at
    the time of the merge. The fourth column counts the number of points
    which comprise each new node.

    :param pairwise_estimates: dictionary with data frames with pairwise
        estimates of Ks, Ka and Ka/Ks (or at least Ks), as returned by
        :py:func:`analyse_family`.
    :return: average linkage clustering as performed with
        ``fastcluster.average``.
    """
    if pairwise_estimates is None:
        return None
    # Clustering needs at least two sequences.
    if pairwise_estimates['Ks'].shape[0] < 2:
        return None
    clustering = fastcluster.average(pairwise_estimates['Ks'])
    return clustering
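
The linkage matrix layout described in the docstring above (the two merged node labels, the merge distance, and the size of the new node) can be illustrated with a minimal pure-Python sketch. This is a naive average-linkage (UPGMA) loop written for illustration only. It is not fastcluster's implementation, and the function name `naive_average_linkage` and the toy distance matrix are assumptions introduced here.

```python
def naive_average_linkage(dist):
    """Naive UPGMA on a full symmetric distance matrix (list of lists).

    Hypothetical helper for illustration; returns rows of
    (label_a, label_b, distance, size), mirroring the four columns
    described in the docstring.
    """
    n = len(dist)
    # Active clusters: node label -> list of original point indices.
    clusters = {i: [i] for i in range(n)}
    rows = []
    next_label = n  # newly generated nodes are labeled N, ..., 2N - 2
    while len(clusters) > 1:
        best = None
        labels = sorted(clusters)
        for pos, a in enumerate(labels):
            for b in labels[pos + 1:]:
                # Average of all pairwise distances between the two clusters.
                pairs = [dist[p][q] for p in clusters[a] for q in clusters[b]]
                d = sum(pairs) / len(pairs)
                if best is None or d < best[2]:
                    best = (a, b, d)
        a, b, d = best
        merged = clusters.pop(a) + clusters.pop(b)
        clusters[next_label] = merged
        rows.append((a, b, d, len(merged)))
        next_label += 1
    return rows


# Toy distance matrix for N = 4 points (made-up values).
dist = [
    [0, 1, 4, 5],
    [1, 0, 3, 6],
    [4, 3, 0, 2],
    [5, 6, 2, 0],
]
rows = naive_average_linkage(dist)
for row in rows:
    print(row)
# (0, 1, 1.0, 2)   points 0 and 1 merge into node 4
# (2, 3, 2.0, 2)   points 2 and 3 merge into node 5
# (4, 5, 4.5, 4)   nodes 4 and 5 merge into node 6 = 2N - 2
```

With N = 4 inputs there are N - 1 = 3 merges, and the last row's fourth column equals N because the final node contains every point.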
    distance between the two nodes at each step, i.e. the current minimal
    distance at the time of the merge. The fourth column counts the number of
    points which comprise each new node.

    :param pairwise_estimates: dictionary with data frames with pairwise
        estimates of Ks, Ka and Ka/Ks (or at least Ks), as returned by
        :py:func:`analyse_family`.
    :return: average linkage clustering as performed with
        ``fastcluster.average``.
    """
    # Fill NaN values with something larger than all the rest. This is not a
    # foolproof approach, but it should be reasonable in most cases.
    if np.any(np.isnan(pairwise_estimates)):
        logging.warning("Ks matrix contains NaN values, replaced with 1000")
        pairwise_estimates.fillna(1000, inplace=True)
    clustering = fastcluster.average(pairwise_estimates)
    return clustering
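
The comment in the snippet above describes filling NaN values with "something larger than all the rest", and the code uses the fixed value 1000. As a hedged variation on that idea, the sketch below derives the fill value from the data instead, so it stays above the largest finite entry even when distances exceed 1000. It uses plain nested lists rather than a pandas data frame, and the helper name `fill_nan_above_max` and its `margin` parameter are hypothetical, not part of the original code.

```python
import math


def fill_nan_above_max(matrix, margin=1.0):
    """Replace NaN entries with (largest finite value + margin).

    Hypothetical helper for illustration; returns a new matrix and the
    fill value that was used. The input is left unmodified.
    """
    finite = [v for row in matrix for v in row if not math.isnan(v)]
    # Fall back to 0.0 when every entry is NaN (an assumption of this sketch).
    fill = (max(finite) if finite else 0.0) + margin
    filled = [[fill if math.isnan(v) else v for v in row] for row in matrix]
    return filled, fill


nan = float("nan")
# Toy Ks-like matrix with one missing pairwise estimate (made-up values).
ks = [
    [0.0, 0.5, nan],
    [0.5, 0.0, 2.0],
    [nan, 2.0, 0.0],
]
filled, fill = fill_nan_above_max(ks)
print(fill)       # 3.0
print(filled[0])  # [0.0, 0.5, 3.0]
```

Like the fixed constant, this only pushes pairs with missing estimates apart so that they merge last; it does not recover the missing distance.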