How To Train Your Model With Different Datasets(Image and Text)?
Do you think if you feed your model with various data, you can get your model with higher performance? Definitely, yes. Let’s learn together what is modality and how to do it.
Why Do You Need to Try This?
If you visit any online shopping website like Amazon, you may think of developing any machine learning project. I can suggest developing a recommendation system, automated rating products or expecting product prices. The problem here is which one of the data you will pick.
If you choose to take the image only as the most important feature, you face some challenges like many product images aren’t available shown above in the screenshot or the image has poor quality. You try to simulate like the real world. Any customer buys a product depending on the image and description. When the customer can’t read the description or isn’t described well, he has to look at the product image and vice versa. There are many areas you need to use various data, especially any recommendations. This is called modality.
What is Modality?
It’s combined various data like images, text, speech, and time. Each type you feed your model is called modality and you combine all modalities you used in your pipeline in shared…