Lecture 19 Question Answering


Introduction

  • Definition: question answering (“QA”) is the task of automatically determining the answer for a natural language question
  • Mostly focus on “factoid” questions
  • Factoid questions (not ambiguous)
    • Factoid questions have short, precise answers:
      • What war involved the battle of Chapultepec?
      • What is the date of Boxing Day?
      • What are some fragrant white climbing roses?
      • What are tannins?
  • Non-factoid questions
    • General non-factoid questions require a longer answer, critical analysis, summary, calculation, and more:
      • Why is the date of Australia Day contentious?
      • What is the angle 60 degrees in radians?
  • Why focus on factoid questions?
    • They are easier
    • They have an objective answer
    • Current NLP technologies cannot handle non-factoid answers (still under development)
  • Two key approaches
    • Information retrieval-based QA
      • Given a query, search relevant documents
      • Extract answers from within these relevant documents
    • Knowledge-based QA
      • Builds semantic representation of the query
      • Query database of facts to find answers

IR-based QA (dominant approach)

  • IR-based factoid QA: TREC-QA

    1. Use question to make query for IR engine
      • query formulation: extract key terms to search for documents in the database
      • answer type detection: guess what type of answer the question is looking for
    2. Find documents, and passages within those documents (find the most relevant passages)
    3. Extract a short answer string (using the relevant passages and answer type information)
  • question processing

    • Find key parts of question that will help retrieval
      • Discard non-content words/symbols (wh-word, ?, etc)
      • Formulate as a tf-idf query, using unigrams or bigrams (see the sketch after this block)
      • Identify entities and prioritise match
    • May reformulate question using templates
      • E.g. “Where is Federation Square located?”
      • Query = “Federation Square located”
      • Query = “Federation Square is located [in/at]”
    • Predict expected answer type (here = LOCATION)
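
Before moving on to answer types, here is a minimal sketch of the query-formulation step above: it lowercases the question, discards wh-words and other non-content tokens, and emits unigram and bigram terms. The stop-word list is an assumption for illustration; a real system would weight the terms with tf-idf against its document collection.

```python
import re
from collections import Counter

# Assumed stop list for illustration: wh-words plus a few function words
NON_CONTENT = {"who", "what", "when", "where", "why", "which", "how",
               "is", "are", "was", "were", "the", "a", "an", "of", "in"}

def formulate_query(question):
    """Turn a natural-language question into unigram/bigram query terms."""
    # Lowercase and drop punctuation such as the trailing "?"
    tokens = re.findall(r"[a-z0-9]+", question.lower())
    # Discard wh-words and other non-content words
    content = [t for t in tokens if t not in NON_CONTENT]
    bigrams = [" ".join(pair) for pair in zip(content, content[1:])]
    # These counts would feed a tf-idf-weighted query to the IR engine
    return Counter(content + bigrams)

print(formulate_query("Where is Federation Square located?"))
# Counter({'federation': 1, 'square': 1, 'located': 1,
#          'federation square': 1, 'square located': 1})
```
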
  • answer types

    • Knowing the type of answer can help in:
      • finding the right passage containing the answer
      • finding the answer string
    • Treat as classification (a closed set of answer types)
      • given question, predict answer type
      • key feature is question headword
      • What are the animals on the Australian coat of arms? (headword = “animals”)
      • Generally not a difficult task (a toy classifier sketch follows this list)
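
A toy sketch of answer-type classification, treating the wh-word and question headword as features. The training examples, feature names, and type labels below are illustrative assumptions, not the lecture's actual classifier.

```python
from sklearn.feature_extraction import DictVectorizer
from sklearn.linear_model import LogisticRegression

# Toy training data: (wh-word, headword) features -> answer type.
# Real systems use richer features and a larger answer-type taxonomy.
train = [
    ({"wh": "where", "head": "square"},  "LOCATION"),
    ({"wh": "who",   "head": "mp"},      "PERSON"),
    ({"wh": "when",  "head": "day"},     "DATE"),
    ({"wh": "what",  "head": "animals"}, "ANIMAL"),
]

vec = DictVectorizer()
X = vec.fit_transform([features for features, _ in train])
y = [label for _, label in train]
clf = LogisticRegression(max_iter=1000).fit(X, y)

# Predict the answer type for a new question, e.g. "Where is Federation Square located?"
print(clf.predict(vec.transform([{"wh": "where", "head": "located"}]))[0])
```
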
  • retrieval

    • Find top n documents matching query (standard IR)
    • Next find passages (paragraphs or sentences) in these documents (also driven by IR)
    • a good passage should contain:
      • many instances of the question keywords
      • several named entities of the answer type
      • close proximity of these terms in the passage
      • high ranking by IR engine
    • Re-rank IR outputs to find the best passage (e.g., using supervised learning; a heuristic scoring sketch follows below)
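
A hedged sketch of the passage-scoring idea above: count query-keyword matches, count named entities of the expected answer type, and reward proximity of the matched terms. The weights and the `entity_types` input are assumptions; in practice entity types come from an NER tagger and the re-ranking is usually learned.

```python
def score_passage(passage_tokens, query_terms, expected_type, entity_types,
                  weights=(1.0, 2.0, 0.5)):
    """Heuristic passage score combining keyword overlap, entity-type matches,
    and keyword proximity. entity_types maps a token to its NER label,
    e.g. {"melbourne": "LOCATION"} (assumed to come from an NER tagger)."""
    kw_positions = [i for i, t in enumerate(passage_tokens) if t in query_terms]
    kw_hits = len(kw_positions)
    type_hits = sum(1 for t in passage_tokens if entity_types.get(t) == expected_type)
    # Proximity: a shorter window covering all keyword matches scores higher
    proximity = 1.0 / (max(kw_positions) - min(kw_positions) + 1) if kw_positions else 0.0
    w_kw, w_type, w_prox = weights
    return w_kw * kw_hits + w_type * type_hits + w_prox * proximity

passage = "the division of melbourne is represented by adam bandt".split()
print(score_passage(passage,
                    query_terms={"federal", "mp", "melbourne"},
                    expected_type="PERSON",
                    entity_types={"adam": "PERSON", "bandt": "PERSON",
                                  "melbourne": "LOCATION"}))   # 5.5
```
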
  • answer extraction

    • Find a concise answer to the question, as a span in the passage

      • “Who is the federal MP for Melbourne?”
      • The Division of Melbourne is an Australian Electoral Division in Victoria, represented since the 2010 election by Adam Bandt, a member of the Greens.
      • “How many Australian PMs have there been since 2013?”
      • Australia has had five prime ministers in five years. No wonder Merkel needed a cheat sheet at the G-20.
    • how?

      • Use a neural network to extract answer
      • Also known as the reading comprehension task (assuming the query and evidence passage are given, find the answer span)
      • But deep learning models require lots of data
      • Do we have enough data to train comprehension models?
    • dataset

      • MCTest (a dataset)

        • Crowdworkers write fictional stories, questions and answers
        • 500 stories, 2000 questions
        • Multiple choice questions
      • SQuAD

        • Use Wikipedia passages (easier to create than MCTest)
        • First set of crowdworkers create questions (given passage)
        • Second set of crowdworkers label the answer
        • 150K questions (!)
        • The second version includes unanswerable questions (no answer in the passage); an illustrative record is shown below
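
For reference, a single SQuAD-style record pairs a passage with a question and a character-offset answer span, roughly as below (the field names follow the layout used by the Hugging Face `datasets` version of SQuAD; the example itself is constructed here for illustration):

```python
example = {
    "question": "Who is the federal MP for Melbourne?",
    "context": ("The Division of Melbourne is an Australian Electoral Division in "
                "Victoria, represented since the 2010 election by Adam Bandt, a "
                "member of the Greens."),
    "answers": {"text": ["Adam Bandt"], "answer_start": [114]},  # character offset
}

# The (start, end) span supervision is exactly what the reading-comprehension
# models below are trained to predict.
start = example["answers"]["answer_start"][0]
gold = example["answers"]["text"][0]
assert example["context"][start:start + len(gold)] == gold
```
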
    • reading comprehension

      • Given a question and a context passage, predict where the answer span starts and ends in the passage

      • Compute:

        • $P_{start}(i)$: probability that token $i$ is the starting token
        • $P_{end}(i)$: probability that token $i$ is the ending token
      • LSTM-based model

        • Feed question tokens to a bidirectional LSTM

        • Aggregate LSTM outputs via weighted sum to produce q, the final question embedding

        • Process passage in a similar way, using another bidirectional LSTM

        • More than just word embeddings as input

          • A feature to denote whether the word matches a question word
          • POS feature
          • Weighted question embedding: produced by attending to each question word
        • $\{p_1, ..., p_m\}$: one vector for each passage token from the bidirectional LSTM

        • To compute the start and end probabilities for each token (see the sketch after these formulas):

          • $p_{start}(i) \propto \exp(p_i W_s q)$
          • $p_{end}(i) \propto \exp(p_i W_e q)$
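
The bilinear scoring above can be written in a few lines of PyTorch. The sketch assumes the question embedding q and the passage token vectors p_1..p_m have already been produced by the two bidirectional LSTMs; the dimensions and variable names are illustrative.

```python
import torch
import torch.nn as nn

hidden = 128   # BiLSTM output size (assumed)
m = 40         # number of passage tokens (assumed)

# Assume these came out of the question / passage BiLSTMs described above
q = torch.randn(hidden)       # final question embedding
P = torch.randn(m, hidden)    # one vector per passage token, p_1..p_m

W_s = nn.Parameter(torch.randn(hidden, hidden))   # start bilinear weights
W_e = nn.Parameter(torch.randn(hidden, hidden))   # end bilinear weights

# p_start(i) ∝ exp(p_i W_s q),  p_end(i) ∝ exp(p_i W_e q)
p_start = torch.softmax(P @ W_s @ q, dim=0)   # shape (m,)
p_end   = torch.softmax(P @ W_e @ q, dim=0)   # shape (m,)

# At prediction time, pick the best (start, end) pair with start <= end
start_idx = int(p_start.argmax())
end_idx = start_idx + int(p_end[start_idx:].argmax())
print(start_idx, end_idx)
```
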
      • BERT-based model

        • Fine-tune BERT to predict the answer span (a shape-level sketch follows below)

          • $p_{start}(i) \propto \exp(S^T T_i')$
          • $p_{end}(i) \propto \exp(E^T T_i')$
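
A shape-level sketch of the same idea for BERT, assuming the contextual token vectors T_i' come from BERT's final layer over the concatenated question-plus-passage input; S and E are the learned start and end vectors, and everything else here (sizes, gold indices) is illustrative rather than a full training script.

```python
import torch
import torch.nn as nn

hidden = 768   # BERT-base hidden size
n = 60         # length of the "[CLS] question [SEP] passage [SEP]" input (assumed)

# Assume T holds BERT's final-layer vectors T_i' over the question+passage input
T = torch.randn(n, hidden)

S = nn.Parameter(torch.randn(hidden))   # learned start vector
E = nn.Parameter(torch.randn(hidden))   # learned end vector

# p_start(i) ∝ exp(S^T T_i'),  p_end(i) ∝ exp(E^T T_i')
p_start = torch.softmax(T @ S, dim=0)
p_end   = torch.softmax(T @ E, dim=0)

# Fine-tuning minimises the negative log-likelihood of the gold start/end positions
gold_start, gold_end = 12, 14   # illustrative gold span indices
loss = -(torch.log(p_start[gold_start]) + torch.log(p_end[gold_end]))
print(float(loss))
```
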
        • why BERT works better than LSTM

          • It’s pre-trained and so already “knows” language before it’s adapted to the task
          • Self-attention architecture allows fine-grained analysis between words in question and context paragraph

Knowledge-based QA

  • QA over structured KB
    • Many large knowledge bases
      • Freebase, DBpedia, Yago, …
    • Can we support natural language queries?
      • E.g.
        • ‘When was Ada Lovelace born?’ → birth-year(Ada Lovelace, ?x)
        • ‘What is the capital of England?’ → capital-city(?x, England)
      • Link “Ada Lovelace” with the correct entity in the KB to find triple (Ada Lovelace, birth-year, 1815)
    • but
      • Converting a natural language sentence into a triple is not trivial
        • ‘When was Ada Lovelace born?’ → birth-year(Ada Lovelace, ?x)
      • Entity linking is also an important component
        • Ambiguity: “When was Lovelace born?”
      • Can we simplify this two-step process?
    • semantic parsing
      • Convert questions into logical forms to query KB directly

        • Predicate calculus
        • Database query languages (e.g. SQL); a toy template-based sketch follows below
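
As a toy illustration of mapping questions to logical forms (nowhere near a real semantic parser), the sketch below matches two question templates, fills the entity slot, and looks the resulting triple up in a tiny hand-written KB. The patterns and facts are assumptions, and entity linking is taken for granted.

```python
import re

# Tiny toy knowledge base of (subject, relation) -> object triples
KB = {
    ("Ada Lovelace", "birth-year"): "1815",
    ("England", "capital-city"): "London",
}

# Template rules: question pattern -> relation whose object is the answer
TEMPLATES = [
    (re.compile(r"when was (.+) born", re.I), "birth-year"),
    (re.compile(r"what is the capital of (.+)", re.I), "capital-city"),
]

def answer(question):
    q = question.strip(" ?")
    for pattern, relation in TEMPLATES:
        match = pattern.match(q)
        if match:
            entity = match.group(1)   # entity linking is assumed to be solved here
            return KB.get((entity, relation))
    return None

print(answer("When was Ada Lovelace born?"))      # 1815
print(answer("What is the capital of England?"))  # London
```
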
      • how to build a semantic parser

        • Text-to-text problem:
          • Input = natural language sentence
          • Output = string in logical form
        • Encoder-decoder model (but do we have enough data?)

Hybrid QA

  • hybrid methods

    • Why not use both text-based and knowledge-based resources for QA?
    • IBM’s Watson, which won the game show Jeopardy!, uses a wide variety of resources to answer questions
      • (question) THEATRE: A new play based on this Sir Arthur Conan Doyle canine classic opened on the London stage in 2007.
      • (answer) The Hound Of The Baskervilles
  • core idea of Watson

    • Generate lots of candidate answers from text-based and knowledge-based sources
    • Use a rich variety of evidence to score them
    • Many components in the system, most trained separately
  • QA evaluation

    • IR: Mean Reciprocal Rank for systems returning matching passages or answer strings
      • E.g. the system returns 4 passages for a query, and the first correct passage is the 3rd passage
      • Reciprocal rank = 1/3; MRR averages this over all queries (see the sketch after this list)
    • MCTest: Accuracy
    • SQuAD: Exact match of string against gold answer
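
A small sketch of computing Mean Reciprocal Rank over several queries, matching the worked example above; the result lists are made up for illustration.

```python
def mean_reciprocal_rank(results):
    """results: for each query, a list of booleans marking whether each
    returned passage is correct, in ranked order."""
    total = 0.0
    for ranked in results:
        for rank, is_correct in enumerate(ranked, start=1):
            if is_correct:
                total += 1.0 / rank
                break   # only the first correct result counts
    return total / len(results)

# Query 1: first correct passage is ranked 3rd -> reciprocal rank 1/3
# Query 2: first correct passage is ranked 1st -> reciprocal rank 1
print(mean_reciprocal_rank([[False, False, True, False],
                            [True, False]]))   # (1/3 + 1) / 2 ≈ 0.667
```
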

Conclusion

  • IR-based QA: search textual resources to answer questions
    • Reading comprehension: assumes question+passage
  • Knowledge-based QA: search structured resources to answer questions
  • Hot area: many new approaches & evaluation datasets being created all the time (narratives, QA, commonsense reasoning, etc)