A Semi-Manual Data Labeling Method
One of the most difficult tasks in Machine Learning is data labeling. In the case of Object Detection, the manual load grows enormously, since each and every object of interest has to be labelled, and labelled precisely, which not only consumes a considerable amount of time but also tests one’s patience!
The blog by Vikas Kumar Ojha on Classification of Unlabeled Images is a very good approach to start with, but it works well mainly in the single-object scenario, meaning there is only one object per image. Since we plan to tackle the multiple-object scenario, an alternative plan is needed.
So basically, a model is usually trained on labelled data and evaluated in terms of performance and consistency, and many authors state that ‘their’ model shows very promising results. Well, if a model shows such ‘promising’ results, it can very well help us label the images!
Boooom!! So, we plan to exploit the model itself to label the objects in the image. The next question that arises is: what type of model are we going to select? There are several models and algorithms for Object Detection, but we preferred the YOLO family as it has a simpler approach, where the label information, namely the Label ID and the Bounding Box (BBOX) coordinates, is saved in a text (.txt) file. Importantly, the YOLO models are pretrained on the COCO dataset, which consists of about 80 classes. The GitHub link for the YOLO model can be found here.
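For reference, each line of such a YOLO-format .txt file describes one object as the class ID followed by the normalized box centre, width and height; the values below are purely illustrative:

```
# <class_id> <x_center> <y_center> <width> <height>   (coordinates normalized to [0, 1])
14 0.512 0.430 0.285 0.610
```

Here, class 14 corresponds to ‘bird’ in the 80-class COCO list used by YOLO.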
There are other annotation formats that can be used as well, one such being PASCAL VOC, which, unlike YOLO, saves the information in an .xml file.
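For comparison, a trimmed sketch of a PASCAL VOC annotation for a single object might look like the following (file name, image size and box values are illustrative); note that VOC stores absolute pixel corners instead of YOLO’s normalized centre/width/height:

```xml
<annotation>
  <filename>bird_001.jpg</filename>
  <size><width>640</width><height>480</height><depth>3</depth></size>
  <object>
    <name>bird</name>
    <bndbox>
      <xmin>120</xmin><ymin>80</ymin><xmax>310</xmax><ymax>300</ymax>
    </bndbox>
  </object>
</annotation>
```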
With the algorithm and the model chosen, a toolkit is needed to visualize and adjust the results. There are indeed many labelling toolkits available, but we have chosen the ‘LabelImg’ toolkit here. The installation of the toolkit is described in the following GitHub link. With the basic requirements and queries sorted out, let’s now lay out this idea by means of a flow chart.
PROPOSED PROCEDURE
So, let's go through the above steps, one by one:
(a) Unlabeled data: This is the data (in our case, images) that is to be labelled.
(b) Use Transfer Learning: This is the stage where a pretrained model (YOLO in our case) labels the objects present in the Unlabeled data.
(c) Bounding Boxes & Label IDs: This stage saves the information of the detected objects such as the bounding box (coordinates) and the Label ID in the form of .txt file.
(d) Addition: Here, the outputs of stages (a) and (c) are combined in a single folder and reviewed with the ‘LabelImg’ toolkit for any modifications, if required.
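A minimal sketch of steps (b) and (c), assuming YOLOv5 loaded through torch.hub and a folder of unlabeled .jpg images; the folder names, confidence threshold and model size here are assumptions for illustration, not part of the original setup:

```python
# Sketch of steps (b) and (c): pre-label images with a COCO-pretrained YOLOv5 model.
# Assumed setup: folders 'unlabeled_images' and 'labels', and the small 'yolov5s' variant.
from pathlib import Path
import torch

model = torch.hub.load('ultralytics/yolov5', 'yolov5s', pretrained=True)

image_dir = Path('unlabeled_images')   # hypothetical folder holding the raw images
label_dir = Path('labels')             # YOLO-format .txt files are written here
label_dir.mkdir(exist_ok=True)

for img_path in image_dir.glob('*.jpg'):
    results = model(str(img_path))
    # xywhn holds normalized [x_center, y_center, width, height, confidence, class]
    lines = []
    for *box, conf, cls in results.xywhn[0].tolist():
        if conf < 0.4:                  # assumed confidence cut-off; tune as needed
            continue
        lines.append(f"{int(cls)} " + " ".join(f"{v:.6f}" for v in box))
    # One .txt per image, named after the image, as LabelImg's YOLO mode expects
    (label_dir / f"{img_path.stem}.txt").write_text("\n".join(lines))

# LabelImg's YOLO mode also looks for a classes.txt listing the class names.
names = model.names
names = list(names.values()) if isinstance(names, dict) else list(names)
(label_dir / 'classes.txt').write_text("\n".join(names))
```

In step (d), these .txt files simply sit alongside the images (or are set as LabelImg’s save directory), so the pre-drawn boxes show up ready for correction.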
The modification stage implies some minor adjustments to the bounding boxes, or situations where an additional object of interest is to be labeled. The information for the additional object, such as its Bounding Box coordinates and Label ID, is automatically recorded in the corresponding text file!
It is also important to note that the modification stage does involve some manual work, but this manual load is much smaller than that of traditional labeling methods. This is one reason why the proposed procedure is called ‘Semi-Manual’ Labeling.
(e) Training Phase on Custom Data: Finally, once the Unlabeled data are labeled (by the model, with modifications), the training stage can commence! This can be done with the YOLO model too, but it requires some pre-processing, such as creating a .yaml file; Luiz doleron’s blog on Training YOLOv5 custom dataset with ease provides more information on this.
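As a rough idea, this .yaml file mainly points YOLOv5 to the image folders and lists the classes; the paths and class names below are assumptions for illustration:

```yaml
# hypothetical dataset description for YOLOv5 training
train: datasets/custom/images/train   # folder with training images
val: datasets/custom/images/val       # folder with validation images
nc: 2                                 # number of classes
names: ['bird', 'dog']                # class names, in Label-ID order
```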
(f) Save the model Weights: Once the training is done, voilà! The trained weights (the .pt file) can then be used to label other, similar data of interest!
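To reuse those weights for pre-labeling new data, the trained checkpoint can be loaded in place of the stock COCO-pretrained one; the path below is a hypothetical YOLOv5 output location:

```python
import torch

# Load the custom-trained weights instead of the COCO-pretrained model;
# the pre-labeling loop from the earlier sketch can then be reused as-is.
model = torch.hub.load('ultralytics/yolov5', 'custom',
                       path='runs/train/exp/weights/best.pt')
```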
QUALITATIVE RESULTS
It can be observed from the image above that the bounding box works well, but not to the complete extent, as the beak portion is not fully covered. This can be adjusted with the help of the toolkit. One could argue that there is still some manual work involved here, but the work is cut short thanks to the bounding box already drawn by the pretrained model.
CONCLUSION
The above works well for labeling multiple objects in a dataset too. Not to mention again, even though additional objects of interest have to be labeled manually, the load is comparatively less than with naïve dataset labelling, which is why this approach is considered a ‘Semi-Manual’ one.
One issue with this method, however, is that on densely cluttered datasets with many objects per image, the effort would still be on par with that of naïve labelling methods.
On the other hand, this approach can be extended to various other applications too, with some modifications to the flow chart, if necessary.
This is my first blog here. I would really appreciate any suggestions from readers regarding the content or even the flow of the written text, which would help me understand and write future blogs in a better way.