Todo
Giving attention to transformers
Generative pre-trained transformer
Language Models are Few-Shot Learners
Searching through a large number of messages, such as a Slack channel, for a specific piece of information (say, a documentation link) can be difficult and time-consuming. One approach is to use a language model like GPT-3 to generate a fake answer to the question, such as "the documentation link is at https:...", and then search the messages for text close to that answer. This can be made even more effective with embeddings: run the text of each message through part of the language model to produce a "meaning hash", a numerical representation that retains much of the original meaning, so that search matches on the meaning of the text rather than its specific characters. Storing an embedding is similar to storing a hash of the text in a search database, except that the language model ensures much of the text's meaning is preserved.
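A minimal sketch of the fake-answer search idea. The `embed` function here is a toy bag-of-words stand-in for the language-model "meaning hash" (a real system would call an embedding model), and the `fake_answer` string is an assumed example of what GPT-3 might generate:

```python
import math
from collections import Counter

def embed(text):
    # Toy stand-in for a language-model embedding: a bag-of-words
    # vector. A real system would call an embedding model here.
    return Counter(text.lower().split())

def cosine(a, b):
    # Cosine similarity between two sparse bag-of-words vectors.
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

messages = [
    "lunch is at noon tomorrow",
    "the documentation link is https://docs.example.com",
    "standup moved to 10am",
]

# Instead of searching with the question itself, search with a fake
# answer a model like GPT-3 might generate for it.
fake_answer = "the documentation link is at https: ..."

best = max(messages, key=lambda m: cosine(embed(fake_answer), embed(m)))
```

The fake answer shares the shape of the message we want, so it lands closer to it in the embedding space than the original question would.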
Dense retrieval models the similarity between a query and a document as an inner product. Given a query q and a document d, it uses two encoder functions enc_q and enc_d to map them into vectors v_q and v_d of the same dimension, whose inner product is used as the similarity measure.
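The inner-product setup can be sketched as follows. The bag-of-words encoder and the tiny vocabulary are toy stand-ins for the learned encoders enc_q and enc_d:

```python
import numpy as np

# Toy vocabulary; a learned encoder would not need one.
VOCAB = ["paris", "is", "the", "capital", "of", "france",
         "berlin", "weather", "today"]

def enc(text):
    # Toy bag-of-words encoder standing in for the learned enc_q / enc_d,
    # which map text into fixed-dimension vectors.
    v = np.zeros(len(VOCAB))
    for tok in text.lower().split():
        if tok in VOCAB:
            v[VOCAB.index(tok)] += 1.0
    return v

v_q = enc("capital of france")
# Inner product as the similarity measurement.
sim_rel = float(v_q @ enc("paris is the capital of france"))
sim_irr = float(v_q @ enc("berlin weather today"))
```

The relevant document scores higher because its vector overlaps with the query vector; in a real system enc_q and enc_d are separate neural encoders trained so that this holds semantically, not just lexically.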
Vector similarity: Cosine Similarity
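Cosine similarity is the inner product of the L2-normalised vectors, i.e. the cosine of the angle between them, in [-1, 1]:

```python
import numpy as np

def cosine_similarity(a, b):
    # Inner product divided by the product of the norms:
    # equivalently, the dot product of the unit vectors.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

same = cosine_similarity(np.array([1.0, 0.0]), np.array([2.0, 0.0]))
orthogonal = cosine_similarity(np.array([1.0, 0.0]), np.array([0.0, 1.0]))
```

Parallel vectors give 1.0 regardless of magnitude, orthogonal vectors give 0.0, which is why cosine similarity is often preferred over a raw inner product when vector norms vary.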
In zero-shot retrieval, we have a set of queries Q1, Q2, ..., QL and corresponding sets of documents D1, D2, ..., DL. The task is to find a way to match the queries with the relevant documents without using any previously labeled data (i.e. no labeled query-document pairs that indicate which documents are relevant to which queries). The challenge is in finding a way to map the queries and documents into a common space where their similarity can be measured using an inner product, without any labeled data to guide this mapping.
The document encoder learns this embedding space using unsupervised contrastive learning: it learns to map documents to vector representations such that similar documents have similar representations. This is done with a function enc_con that is shared by all incoming document corpora.
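Unsupervised contrastive training is commonly driven by an InfoNCE-style loss: pull two views of the same document together and push other documents away. The function below is an illustrative sketch of that loss, not the exact objective used by any particular encoder:

```python
import numpy as np

def info_nce(q, pos, negs, tau=0.05):
    # q and pos are embeddings of two views of the same document;
    # negs are embeddings of other documents. tau is a temperature.
    sims = np.array([q @ pos] + [q @ n for n in negs]) / tau
    sims -= sims.max()                        # numerical stability
    probs = np.exp(sims) / np.exp(sims).sum() # softmax over candidates
    return float(-np.log(probs[0]))           # low when q is closest to pos

q = np.array([1.0, 0.0])
loss_aligned = info_nce(q, np.array([1.0, 0.0]),
                        [np.array([0.0, 1.0]), np.array([-1.0, 0.0])])
loss_misaligned = info_nce(q, np.array([-1.0, 0.0]),
                           [np.array([0.0, 1.0]), np.array([1.0, 0.0])])
```

Minimising this loss over many documents is what shapes the shared embedding space so that inner products reflect similarity.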
To search for relevant documents for a given query, HyDE uses an instruction-following language model (InstructLM) that takes a query and a textual instruction (INST) and generates a "hypothetical" document that is likely to be relevant to the query. The idea is that this document will capture the relevance pattern of the query, even though it is not a real document and may not be factually accurate. This way, HyDE offloads the task of relevance modeling from the representation learning model to a natural language generation model, which generalizes more easily and effectively.
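The HyDE pipeline can be sketched end to end. The generator below is a canned stub standing in for the instruction-following LM (a real system would prompt a model like InstructGPT), and the bag-of-words encoder is a toy stand-in for the shared contrastive encoder:

```python
import numpy as np

def generate_hypothetical_doc(query, instruction):
    # Stub for the instruction-following LM (InstructLM). A real HyDE
    # system would prompt a model with the instruction and query; this
    # canned passage just lets the sketch run end to end.
    return "Paris is the capital and largest city of France."

def enc(text):
    # Toy bag-of-words encoder standing in for the shared contrastive
    # encoder that embeds both the hypothetical doc and the corpus.
    vocab = ["paris", "capital", "france", "berlin", "germany", "city"]
    v = np.zeros(len(vocab))
    for tok in text.lower().split():
        tok = tok.strip(".,")
        if tok in vocab:
            v[vocab.index(tok)] += 1.0
    return v

corpus = [
    "Berlin is the capital of Germany.",
    "Paris is the capital of France.",
]

query = "what is the capital of france"
hyde_doc = generate_hypothetical_doc(
    query, "write a passage that answers the question")
v = enc(hyde_doc)  # embed the fake document, not the query
best = max(corpus, key=lambda d: float(v @ enc(d)))
```

Note that the hypothetical document's facts never reach the user; it serves only as a search key, so retrieval still returns real documents even when the generation is wrong in detail.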
Hypothetical Document Embeddings
Precise Zero-Shot Dense Retrieval without Relevance Labels
We use a key-value store to track entity facts over time. For new input, we: