Fashion discovery : a computer vision approach

Rubio Romano, Antonio

Fashion discovery : a computer vision approach

dc.contributor

Institut de Robòtica i Informàtica Industrial

dc.contributor.author

Rubio Romano, Antonio

dc.date.accessioned

2021-09-17T09:39:43Z

dc.date.available

2021-09-17T09:39:43Z

dc.date.issued

2021-07-23

dc.identifier.uri

http://hdl.handle.net/10803/672423

dc.description

Pla de Doctorats Industrials

dc.description.abstract

Performing semantic interpretation of fashion images is undeniably one of the most challenging domains for computer vision. Subtle variations in color and shape might confer different meanings or interpretations to an image. Not only is it a domain tightly coupled with human understanding, but also with scene interpretation and context. Being able to extract fashion-specific information from images and interpret that information in a proper manner can be useful in many situations and help understanding the underlying information in an image. Fashion is also one of the most important businesses around the world, with an estimated value of 3 trillion dollars and a constantly growing online market, which increases the utility of image-based algorithms to search, classify or recommend garments. This doctoral thesis aims to solve specific problems related with the treatment of fashion e-commerce data, from low-level pure pixel information to high-level abstract conclusions of the garments appearing in an image, taking advantage of the multi-modality of the available data for developing some of the solutions. The contributions include: - A new superpixel extraction method focused on improving the annotation process for clothing images. - The construction of an image and text embedding for fashion data. - The application of this embedding space to the task of retrieving the main product in an image showing a complete outfit. In summary, fashion is a complex computer vision and machine learning problem at many levels, and developing specific algorithms that are able to capture essential information from pictures and text is not trivial. In order to solve some of the challenges it proposes, and taking into account that this is an Industrial Ph.D., we contribute with a variety of solutions that can boost the performance of many tasks useful for the fashion e-commerce industry.

dc.description.abstract

La interpretación semántica de imágenes del mundo de la moda es sin duda uno de los dominios más desafiantes para la visión por computador. Leves variaciones en color y forma pueden conferir significados o interpretaciones distintas a una imagen. Es un dominio estrechamente ligado a la comprensión humana subjetiva, pero también a la interpretación y reconocimiento de escenarios y contextos. Ser capaz de extraer información específica sobre moda de imágenes e interpretarla de manera correcta puede ser útil en muchas situaciones y puede ayudar a entender la información subyacente en una imagen. Además, la moda es uno de los negocios más importantes a nivel global, con un valor estimado de tres trillones de dólares y un mercado online en constante crecimiento, lo cual aumenta el interés de los algoritmos basados en imágenes para buscar, clasificar o recomendar prendas. Esta tesis doctoral pretende resolver problemas específicos relacionados con el tratamiento de datos de tiendas virtuales de moda, yendo desde la información más básica a nivel de píxel hasta un entendimiento más abstracto que permita extraer conclusiones sobre las prendas presentes en una imagen, aprovechando para ello la Multi-modalidad de los datos disponibles para desarrollar algunas de las soluciones. Las contribuciones incluyen: - Un nuevo método de extracción de superpíxeles enfocado a mejorar el proceso de anotación de imágenes de moda. - La construcción de un espacio común para representar imágenes y textos referentes a moda. - La aplicación de ese espacio en la tarea de identificar el producto principal dentro de una imagen que muestra un conjunto de prendas. En resumen, la moda es un dominio complejo a muchos niveles en términos de visión por computador y aprendizaje automático, y desarrollar algoritmos específicos capaces de capturar la información esencial a partir de imágenes y textos no es una tarea trivial. Con el fin de resolver algunos de los desafíos que esta plantea, y considerando que este es un doctorado industrial, contribuimos al tema con una variedad de soluciones que pueden mejorar el rendimiento de muchas tareas extremadamente útiles para la industria de la moda online

dc.format.extent

114 p.

dc.format.mimetype

application/pdf

dc.language.iso

eng

dc.publisher

Universitat Politècnica de Catalunya

dc.rights.license

L'accés als continguts d'aquesta tesi queda condicionat a l'acceptació de les condicions d'ús establertes per la següent llicència Creative Commons: http://creativecommons.org/licenses/by-sa/4.0/

dc.rights.uri

http://creativecommons.org/licenses/by-sa/4.0/

dc.source

TDX (Tesis Doctorals en Xarxa)

dc.subject

Image retrieval

dc.subject

Multimodal embedding

dc.subject

Superpixels

dc.subject

Siamese neural networks

dc.subject

Fashion

dc.subject.other

Àrees temàtiques de la UPC::Informàtica

dc.title

Fashion discovery : a computer vision approach

dc.type

info:eu-repo/semantics/doctoralThesis

dc.type

info:eu-repo/semantics/publishedVersion

dc.subject.udc

004

dc.contributor.director

Moreno Noguer, Francesc

dc.contributor.codirector

Simó Serra, Edgar

dc.embargo.terms

cap

dc.rights.accessLevel

info:eu-repo/semantics/openAccess

dc.identifier.doi

https://dx.doi.org/10.5821/dissertation-2117-351670

dc.description.degree

Automàtica, robòtica i visió

dc.description.degree

DOCTORAT EN AUTOMÀTICA, ROBÒTICA I VISIÓ (Pla 2013)

Documents

TARR1de1.pdf

47.98Mb PDF

This item appears in the following Collection(s)

Programa de Doctorat en Automàtica, Robòtica i Visió [140]