1.1 use of computation
Computers and algorithms were originally developed to solve what might be called concrete tasks, such as:
- compute a missile trajectory.
- crack a code (decryption).
What these have in common: the task is well-defined, and we can assess whether the solution is correct.
In these tasks, the data is transformed in a mechanical way or leads to a mechanical action, but the tasks enhance our (that is, human) knowledge only in a very limited way.
Hence they are not “Knowledge Technologies”.
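As an illustration of a concrete task, here is a minimal sketch of the trajectory example. It assumes an idealised, drag-free projectile launched from ground level; the function name `projectile_range` and the default value of `g` are choices made for this sketch, not part of the notes. The point is that the answer is unambiguous and can be checked.

```python
import math


def projectile_range(speed_m_s: float, angle_deg: float, g: float = 9.81) -> float:
    """Horizontal range of a drag-free projectile launched from ground level.

    Assumption for this sketch: flat ground, no air resistance.
    """
    angle = math.radians(angle_deg)
    return speed_m_s ** 2 * math.sin(2 * angle) / g


# The task is well-defined: the same inputs always give the same answer,
# and the answer can be checked against the closed-form expression v^2 / g.
range_at_45 = projectile_range(100.0, 45.0)
print(round(range_at_45, 1))
```

Nothing in this computation depends on who is asking or why, which is what separates it from the knowledge tasks discussed next.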
1.2 knowledge tasks
Data: measurements (bit patterns for computers)
Information: processed data; patterns that hold for the given data.
Knowledge: information interpreted with respect to a user’s context to extend human understanding in a given area.
Concrete task: well-defined, mechanically processing data to an unambiguous solution.
Knowledge task: data is unreliable or outcome is ill-defined (usually both); the computer mediates between user and data, where the user’s context is critical. Enhances human understanding.
Structured data: conforms to a schema (e.g. database).
Unstructured data: data without regular decomposable structure (e.g. plaintext).
Semi-structured data: data which corresponds in part to a schema, but is irregular or rapidly changing.
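The three kinds of data can be sketched side by side. Everything here is made up for illustration: the movie titles, the field names, and the choice of CSV and JSON as stand-ins for structured and semi-structured data.

```python
import csv
import io
import json

# Structured: every record conforms to one schema (title, year).
table = io.StringIO("title,year\nMovie A,2001\nMovie B,2010\n")
rows = list(csv.DictReader(table))

# Semi-structured: records share some fields, but others come and go.
records = json.loads(
    '[{"title": "Movie A", "year": 2001},'
    ' {"title": "Movie B", "tags": ["drama", "slow"]}]'
)
shared_fields = set(records[0]) & set(records[1])

# Unstructured: plain text with no regular decomposable structure.
review = "Movie A was fine, but Movie B dragged a little in the middle."

print(rows[0]["year"])       # a field looked up by its schema name
print(shared_fields)         # only the fields every record happens to have
print(len(review.split()))   # plain text offers little beyond its tokens
```

With the structured table, any field can be requested by name for any row. With the semi-structured records, only the shared fields are safe to rely on. With the plain text, structure has to be inferred.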
Consider tasks where the data is irregular or unreliable, or the outcome is not well-defined:
- Translation between languages.
- Finding an “optimal” route between two locations (optimal by what measure: distance, time, fuel?).
- Deciding what movie to watch.
These are not computational tasks, but we do use computers to mediate between us and the data, helping us to reach a decision.
Context is critical: the origin of the data, the consumer of the output.
These use, produce, or enhance human knowledge.
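The route example above shows why the outcome is ill-defined: the “best” route changes with the user’s criterion. A small sketch, using invented routes and numbers (the names `routes` and `best_route` are hypothetical):

```python
# Hypothetical candidate routes between the same two locations.
routes = [
    {"name": "highway", "distance_km": 120, "time_min": 75, "fuel_l": 9.0},
    {"name": "coastal", "distance_km": 95, "time_min": 110, "fuel_l": 7.5},
    {"name": "city", "distance_km": 100, "time_min": 90, "fuel_l": 7.0},
]


def best_route(candidates, criterion):
    """Return the route that minimises the given criterion (a dict key)."""
    return min(candidates, key=lambda route: route[criterion])


# The same data yields a different answer for each criterion.
for criterion in ("distance_km", "time_min", "fuel_l"):
    print(criterion, "->", best_route(routes, criterion)["name"])
```

The minimisation itself is mechanical. Choosing the criterion is not, and that choice depends on the user’s context.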
data = raw information
knowledge = patterns or models behind the data
1.3 methods for data analysis
1.3.1 supervised learning
- classification: predicting a discrete class
- regression: predicting a numeric quantity
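The difference between the two supervised tasks is the type of the output. The sketch below uses a one-nearest-neighbour rule for classification and a least-squares line for regression. Both techniques and all of the data are assumptions chosen for brevity, not methods prescribed by these notes.

```python
def nearest_neighbour_class(train, x):
    """Classification: return the label of the training value closest to x."""
    _, label = min(train, key=lambda pair: abs(pair[0] - x))
    return label


def fit_line(points):
    """Regression: least-squares slope and intercept for (x, y) pairs."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Labelled examples (made up): a feature value and a discrete class.
labelled = [(1.0, "short"), (1.5, "short"), (3.0, "long"), (3.5, "long")]
predicted_class = nearest_neighbour_class(labelled, 2.9)

# Labelled examples (made up): a feature value and a numeric target.
numeric = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
slope, intercept = fit_line(numeric)
predicted_value = slope * 4.0 + intercept

print(predicted_class)   # a discrete class
print(predicted_value)   # a numeric quantity
```

In both cases the training data carries the answers (labels or target values), which is what makes the learning supervised.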
1.3.2 unsupervised learning
- association: detecting associations between features
- clustering (information organisation): grouping similar instances into clusters
- reinforcement learning (usually treated as a separate paradigm from supervised and unsupervised learning)
- recommender systems
- anomaly/outlier detection
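To make the clustering item in the list above concrete, here is a minimal one-dimensional k-means sketch. K-means is only one of many clustering methods and is an assumption of this example, as are the data values, the starting centres, and the name `kmeans_1d`.

```python
def kmeans_1d(values, centres, iterations=10):
    """Group numbers around centres by repeated assign-then-average steps."""
    clusters = [[] for _ in centres]
    for _ in range(iterations):
        # Assignment step: each value joins its nearest centre.
        clusters = [[] for _ in centres]
        for v in values:
            nearest = min(range(len(centres)), key=lambda i: abs(v - centres[i]))
            clusters[nearest].append(v)
        # Update step: each centre moves to the mean of its cluster
        # (an empty cluster keeps its previous centre).
        centres = [
            sum(c) / len(c) if c else centres[i]
            for i, c in enumerate(clusters)
        ]
    return centres, clusters


# Unlabelled, made-up measurements: no classes are given in advance.
values = [1.0, 1.2, 0.8, 5.0, 5.3, 4.7]
centres, clusters = kmeans_1d(values, centres=[0.0, 6.0])
print(centres)
print(clusters)
```

Unlike the supervised case, no labels are supplied. The groups come only from how similar the instances are to each other.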