Font recognition using fast.ai

The Jupyter notebook for this exercise can be downloaded here:

The data bundle containing the font catalog used in this exercise is available for download here. The font catalog was generated using the code here. In this notebook we take two fonts, Lato-Hairline and Purisa, and try to train the machine to differentiate between the two.

Here we go:

%matplotlib inline
from fastai import *
from fastai.vision import *
np.random.seed(77)

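import glob
mypath = '/home/jupyter/tutorials/data/fontcatalog2'
# Collect the image files for the two fonts. The trailing wildcards are an
# assumption; adjust the patterns to match how your catalog files are named.
fnames = glob.glob(f'{mypath}/Lato-Hairline*')
fnames.extend(glob.glob(f'{mypath}/Purisa*'))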

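# The first capture group of this pattern becomes the class label of each image
pat = r'\/(\w+[^-\/_])-?\w+.png'
sz = 175  # images are resized to 175 pixels
# Default augmentations, minus flips
tfms = get_transforms(do_flip=False)
# Hold out 20% of the images for validation
data = ImageDataBunch.from_name_re(Path(mypath), fnames, pat, valid_pct=.2,
                                    ds_tfms=tfms, size=sz).normalize(imagenet_stats)
data.show_batch(rows=3, figsize=(7,6))
print(data.classes)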

We get:

['Lato', 'Purisa']
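To see where those class names come from, the pattern can be tried on a couple of paths outside fastai. This is only a minimal sketch: the two file names are hypothetical (the catalog's real naming scheme is not shown here), and `label_from_path` is a helper made up for illustration. It assumes the label is simply the first capture group of the first match in the path.

```python
import re

pat = r'\/(\w+[^-\/_])-?\w+.png'  # same pattern as in the notebook

def label_from_path(path, pattern=pat):
    """Return the first capture group of `pattern` found in `path`, or None."""
    match = re.search(pattern, path)
    return match.group(1) if match else None

# Hypothetical file names, made up for this illustration.
samples = [
    '/data/fontcatalog2/Lato-Hairline_A.png',
    '/data/fontcatalog2/Purisa_A.png',
]
labels = [label_from_path(p) for p in samples]
print(labels)
```

Running a check like this on a handful of real file names before building the data bunch is a cheap way to catch a pattern that splits one font into several classes.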

# Collect the class label of every training and validation image
train, valid = [], []
count_train, count_valid, count_grand = 0, 0, 0
for i in range(len(data.train_ds)):
    train.append(f'{data.train_ds.y[i]}'.split()[0])
for i in range(len(data.valid_ds)):
    valid.append(f'{data.valid_ds.y[i]}'.split()[0])
train = np.asarray(train)
valid = np.asarray(valid)
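With the labels in NumPy arrays, one natural next step is to check how the two fonts are distributed across the training and validation sets. The sketch below is one possible way to do that; `class_counts` is a hypothetical helper, and the short arrays are placeholders standing in for the real `train` and `valid` arrays.

```python
import numpy as np

def class_counts(labels):
    """Map each distinct label to the number of times it occurs."""
    names, counts = np.unique(np.asarray(labels), return_counts=True)
    return {str(n): int(c) for n, c in zip(names, counts)}

# Placeholder labels standing in for the `train` and `valid` arrays.
train_demo = np.asarray(['Lato', 'Purisa', 'Lato', 'Purisa', 'Lato'])
valid_demo = np.asarray(['Purisa', 'Lato'])

print('train:', class_counts(train_demo))
print('valid:', class_counts(valid_demo))
```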

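# Transfer learning: start from a pretrained ResNet-18 and train the new head for one epoch
learn = create_cnn(data, models.resnet18, metrics=error_rate)
learn.fit_one_cycle(1)

# Unfreeze all layers, then plot loss against learning rate to choose a fine-tuning range
learn.unfreeze()
learn.lr_find()
learn.recorder.plot()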
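The fine-tuning call that follows passes a `slice` as `max_lr`. As far as I understand fastai v1, a slice with both ends set is spread geometrically across the model's layer groups, so earlier layers get the smaller rate and the head gets the larger one. The sketch below reproduces that spacing in plain NumPy; `spread_lrs` is a hypothetical name, and the count of three layer groups is an assumption for illustration.

```python
import numpy as np

def spread_lrs(lr_slice, n_groups):
    """Geometrically space learning rates from lr_slice.start to lr_slice.stop."""
    start, stop = lr_slice.start, lr_slice.stop
    if n_groups == 1:
        return np.array([stop])
    step = (stop / start) ** (1 / (n_groups - 1))
    return np.array([start * step ** i for i in range(n_groups)])

# Three layer groups is an assumption for this illustration.
lrs = spread_lrs(slice(2e-4, 3e-4), 3)
print(lrs)
```

With a range as narrow as 2e-4 to 3e-4, the per-group rates end up close to each other, so all layers are fine-tuned at nearly the same pace.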

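# Fine-tune all layers for one more epoch with learning rates in the chosen range
learn.fit_one_cycle(1, max_lr=slice(2e-4,3e-4))
interp = ClassificationInterpretation.from_learner(learn)
interp.plot_confusion_matrix(figsize=(5,5))
# Add up the off-diagonal confusion-matrix counts, i.e. the misclassified images
wrongs = 0
for t in interp.most_confused():
    wrongs += t[2]
print(wrongs, 'wrong')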
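The same count can be read directly off a confusion matrix: every prediction that is not on the diagonal is a mistake. The sketch below shows this in plain NumPy. `count_errors` is a hypothetical helper, and the matrix is made up purely for illustration; it is not the result of this notebook.

```python
import numpy as np

def count_errors(cm):
    """Return (misclassified, error_rate) for a square confusion matrix."""
    cm = np.asarray(cm)
    total = int(cm.sum())
    wrong = total - int(np.trace(cm))  # everything off the diagonal
    return wrong, wrong / total

# Made-up matrix for illustration only; rows are actual, columns predicted.
cm_demo = [[48, 2],
           [1, 49]]
wrong, rate = count_errors(cm_demo)
print(wrong, 'wrong, error rate', rate)
```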

Our model learned to tell Lato and Purisa apart so well that it didn’t make a single mistake on the validation set!

