Indian Ethnicity Classifier using Deep Learning
I trained the model on images of people from different ethnic groups, gathered with the Bing Image Search API.
- Import fastai.
- Download training images with the Bing Image Search API by providing the API key, the search term, and the maximum number of images to download.
- DataLoaders: a fastai class that stores multiple DataLoader objects you pass to it, normally a train and a valid, although it's possible to have as many as you like. The first two are made available as properties.
- Train with transfer learning on the resnet18 architecture; with fastai this takes very few lines of code.
- The initial accuracy is not great (the error rate is around 32%), but we will clean our data to get better results.
- Clean the dataset with the help of fastai's built-in cleaning functions.
- Retrain the model on the cleaned dataset.
- Accuracy improves considerably: the error rate drops to 23%.
- Export the model for use.
- Build a small GUI for the model inside the notebook with the help of IPython widgets (ipywidgets) and Voilà.
- In the GUI, clicking Upload opens a box to choose an image, and clicking Classify prints the result.
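For reference, fastai's error_rate metric is simply the fraction of misclassified validation examples, i.e. 1 − accuracy. A quick hand computation with made-up predictions (the labels below are illustrative, not real model output):

```python
# Hypothetical predictions vs. true labels for four validation images
preds  = ['dravidian', 'indo-aryan', 'dravidian', 'indo-aryan']
labels = ['dravidian', 'dravidian',  'dravidian', 'indo-aryan']

# Count mismatches and divide by the number of examples
wrong = sum(p != t for p, t in zip(preds, labels))
error_rate = wrong / len(labels)
print(error_rate)  # 0.25 -> 25% error, i.e. 75% accuracy
```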
 
 
First we import the libraries needed for this project.
Here we use fastai, which is built on PyTorch.
from fastbook import *
from fastai.vision.widgets import *
key = os.environ.get('AZURE_SEARCH_KEY', 'XXX')
def search_images_bing(key, term, max_images: int = 100, **kwargs):
    params = {'q': term, 'count': max_images}
    headers = {"Ocp-Apim-Subscription-Key": key}
    search_url = "https://api.bing.microsoft.com/v7.0/images/search"
    response = requests.get(search_url, headers=headers, params=params)
    response.raise_for_status()
    search_results = response.json()
    return L(search_results['value'])
ethnicGroups = 'indo-aryan', 'dravidian', 'indian mongoloid'
path = Path('ethnicGroups')
if not path.exists():
    path.mkdir()
for o in ethnicGroups:
    dest = (path/o)
    dest.mkdir(exist_ok=True)
    results = search_images_bing(key, f'{o} people')
    download_images(dest, urls=results.attrgot('contentUrl'))
fns = get_image_files(path)
fns
failed = verify_images(fns)
failed
failed.map(Path.unlink);
Now we have to make the DataLoaders.
DataLoaders: a fastai class that stores multiple DataLoader objects you pass to it, normally a train and a valid, although it's possible to have as many as you like. The first two are made available as properties.[^1]
[^1]: Definition from Jeremy Howard's book *Deep Learning for Coders with fastai and PyTorch*.
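Conceptually, DataLoaders is just a thin container around several DataLoader objects, with the first two exposed as train and valid. A minimal pure-Python sketch of that idea (the class name and code here are illustrative, not fastai's actual implementation):

```python
class DataLoadersSketch:
    """Tiny stand-in for fastai's DataLoaders container (illustrative only)."""
    def __init__(self, *loaders):
        self.loaders = list(loaders)

    @property
    def train(self):
        # first loader passed in
        return self.loaders[0]

    @property
    def valid(self):
        # second loader passed in
        return self.loaders[1]

# Pretend these lists of batches are DataLoader objects
train_batches = [[1, 2], [3, 4]]
valid_batches = [[5, 6]]
dls = DataLoadersSketch(train_batches, valid_batches)
print(dls.train)  # [[1, 2], [3, 4]]
print(dls.valid)  # [[5, 6]]
```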
groups = DataBlock(
    blocks=(ImageBlock, CategoryBlock), 
    get_items=get_image_files, 
    splitter=RandomSplitter(valid_pct=0.2, seed=45),
    get_y=parent_label,
    item_tfms=Resize(128))
dls = groups.dataloaders(path)
dls.valid.show_batch(max_n=4, nrows=1)
As we can see above, some of the downloaded images are not correct, so we have to clean our image dataset. If you can curate the image dataset manually, the results will be much better.
We are using data augmentation for better accuracy. 
 
Data augmentation refers to creating random variations of our input data, such that they appear different but do not actually change the meaning of the data. Examples of common data augmentation techniques for images are rotation, flipping, perspective warping, brightness changes, and contrast changes.[^2]
[^2]: Definition from Jeremy Howard's book *Deep Learning for Coders with fastai and PyTorch*.
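To make the idea concrete without any libraries, here is a toy sketch of two such variations on a tiny 3x3 "image" represented as a list of rows (the helper functions are illustrative, not what aug_transforms does internally):

```python
# A tiny 3x3 grayscale "image" as a list of rows
img = [[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]]

def hflip(im):
    """Horizontal flip: reverse each row."""
    return [row[::-1] for row in im]

def rotate90(im):
    """Rotate 90 degrees clockwise: reverse the rows, then transpose."""
    return [list(row) for row in zip(*im[::-1])]

# Each variant looks different but keeps the same label/meaning
print(hflip(img))     # [[3, 2, 1], [6, 5, 4], [9, 8, 7]]
print(rotate90(img))  # [[7, 4, 1], [8, 5, 2], [9, 6, 3]]
```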
groups = groups.new(item_tfms=Resize(128), batch_tfms=aug_transforms(mult=2))
dls = groups.dataloaders(path)
dls.train.show_batch(max_n=8, nrows=2, unique=True)
Now we train on bigger images (224 px, via RandomResizedCrop) for better results.
groups = groups.new(
    item_tfms=RandomResizedCrop(224, min_scale=0.5),
    batch_tfms=aug_transforms())
dls = groups.dataloaders(path)
learn = cnn_learner(dls, resnet18, metrics=error_rate)
learn.fine_tune(4)
interp = ClassificationInterpretation.from_learner(learn)
interp.plot_confusion_matrix()
interp.plot_top_losses(5, nrows=1)
cleaner = ImageClassifierCleaner(learn)
cleaner
for idx in cleaner.delete(): cleaner.fns[idx].unlink()
for idx,cat in cleaner.change(): shutil.move(str(cleaner.fns[idx]), path/cat)
groups = groups.new(
    item_tfms=RandomResizedCrop(224, min_scale=0.5),
    batch_tfms=aug_transforms())
dls = groups.dataloaders(path)
learn = cnn_learner(dls, resnet18, metrics=error_rate)
learn.fine_tune(4)
learn.export()
path = Path()
path.ls(file_exts='.pkl')
learn_inf = load_learner(path/'export.pkl')
btn_upload = widgets.FileUpload()
btn_upload
img = PILImage.create(btn_upload.data[-1])
out_pl = widgets.Output()
out_pl.clear_output()
with out_pl: display(img.to_thumb(128,128))
out_pl
pred,pred_idx,probs = learn_inf.predict(img)
lbl_pred = widgets.Label()
lbl_pred.value = f'Prediction: {pred}; Probability: {probs[pred_idx]:.04f}'
lbl_pred
btn_run = widgets.Button(description='Classify')
btn_run
def on_click_classify(change):
    img = PILImage.create(btn_upload.data[-1])
    out_pl.clear_output()
    with out_pl: display(img.to_thumb(128,128))
    pred,pred_idx,probs = learn_inf.predict(img)
    lbl_pred.value = f'Prediction: {pred}; Probability: {100* probs[pred_idx]:.02f}%'
btn_run.on_click(on_click_classify)
VBox([widgets.Label('Select image to check ethnicity!'), 
      btn_upload, btn_run, out_pl, lbl_pred])
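The Button/on_click wiring above is just a callback registry: the widget stores the handler and calls it when the user clicks. A minimal stdlib sketch of the same pattern (the FakeButton class is hypothetical, not part of ipywidgets):

```python
class FakeButton:
    """Tiny stand-in for widgets.Button's on_click mechanism (illustrative)."""
    def __init__(self):
        self._handlers = []

    def on_click(self, fn):
        # Register a callback, just like widgets.Button.on_click
        self._handlers.append(fn)

    def click(self):
        # Simulate the user pressing the button
        for fn in self._handlers:
            fn(self)

results = []
btn = FakeButton()
btn.on_click(lambda b: results.append('classified!'))
btn.click()
print(results)  # ['classified!']
```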