Structuring of the domain whisky varieties with the help of text mining
This thesis addresses the challenge of whisky recommendations. Its goal is to create the basis for a fictional whisky recommendation system. Specially, the goal of this thesis is to create a machine-readable representation of knowledge about the distances of whiskies regarding their flavour. For this purpose, a KDD process is defined, which is tailored to the domain and the existing raw data. This includes the use of word embeddings. Different techniques from the field of text mining are applied in the process. A special focus lies on pre-processing the raw text data to improve the resulting word embeddings. After generating representations of whiskies that allow distance calculations, recommendations can be made based on those distances. Finally, these recommendations are assessed in terms of their comprehensibility by several experts as part of the evaluation.