News Recommendation Algorithm

Newsgroup Clustering Based On User Behavior: A Recommendation Algebra

Jussi Karlgren jussi@sics.se

March 1994 *

Abstract

User models are a tool for guiding system behavior in interactive systems, and their utility and properties, desirable and undesirable, have been investigated in this context. There are several ways of utilizing information about the user that have not been implemented, however. In this paper a scheme for users to peek at other users' user models to extract information is proposed, in an information retrieval or information filtering domain. The material used for the study is a set of .newsrc files.

Keywords

Data extraction from user models; clustering.

Background

Compared to an ordinarily untidy bookcase, computerized systems for information retrieval may not always be better. In a normal bookcase interesting documents may be found next to each other, and someone looking for a certain document may unexpectedly find other interesting documents in the vicinity. They are interesting because someone placed them there, and they are placed there because they have some relation to the original document. An unorganized bookcase will self-organize somewhat unsystematically based on the user's behavior. In fact, in a library or a bookstore, people around an interesting bookcase tend to be interesting people. You tend to be able to get good reading tips from them. Similarly, a good librarian will remember that a certain book tends to be read by a certain set of people, and another book by the same set of people, and that there may be a similarity between the books, even though they may not be cataloged together, as, of course, they often will not be. Anyone who has tried to organize a bookcase by topic knows how many cases of unexpected category conflict one encounters.

* This work has been made possible by the generous IBM Electrum Scholarship 1992.

The situation and the request in real information retrieval situations can often be formulated as a form of "I read A Good Book, I want more of the same"¹, posed to a librarian or to a colleague or to a number of colleagues. The idea is, as in all document classification, to use a distance measure to build a document space, and to use clustering algorithms to categorize documents in the space. This is a standard method, and most often the distance measure used has to do with the content of the document, as judged by an intermediary such as a librarian or a documentalist, or by a text search system, using keywords or free-text search.

In the application outlined in this article, the distance measure will be based on knowledge about the user or knowledge about the use of the document rather than knowledge about the content or genre of the document. This knowledge is extracted from user models that indicate the preference of users.

The Domain

Initially, an experiment was performed where novels and video rental movies were used as a document base, but when the data matrix proved to be too sparse, newsgroups on the Usenet News discussion and information dissemination network were used instead. The data used were .newsrc files of Usenet News users. The results will generalize in an obvious way from "documents" that in fact are Usenet News newsgroups to other documents, and further to other similar retrievable objects, information sources, or even patterns of behavior.

User Models

User modeling has found numerous applications and involves numerous research projects, usually having to do with tailoring output from interactive systems to suit the needs of particular users or user types. This paper will assume the existence of a very simple user model essentially containing user grades on documents.

One of the central questions when using user modeling techniques is how the contents of the user model (the grades, in our case) are gathered. This is an interesting question in itself, but will be left aside in this paper. Grades can conceivably be collected through explicit questioning of the user (Rich, 1979), through relevance feedback of some sort (Salton and McGill, 1983), or through inference on user behavior (Kass, 1991). In the application we have in mind now, whether the grades are obtained by explicit user recommendation or from user behavior, that is, either by querying users on their assessments of document relevance or by examining user behavior to determine which documents actually are accessed, the methods for using the information in them will be similar. The system will gather information about several users to service the needs of one specific user, in effect constructing a dynamic user stereotype for a specific query.

¹ Or, as put by Benny Brodda (1992): "Gimmie More O' That".

An Algorithm for Computing Proximity

The information in a set of simple user models can be used to compute a proximity measure between documents. The first step is to define Interest as a relation between a user and a document. The user model assumed in this paper will be very simple, as noted in the previous section, and will be a simple collection of grades:

grade(User, Document, Grade).

The user model contains a vector of user grades. The documents in the document base are graded by a user to be good "+", bad "-", or not accessed "0".

The user X is interested in a document if the grade of that document is "+".

The user X is uninterested in a document if the grade of that document is "-".

The user X does not know the document if the grade of that document is "0".

A more complex grade palette will naturally need further formalization: in this case all we need is a starting point to test the validity of the model.
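As a minimal sketch, the grade relation above could be held as a mapping from (user, document) pairs to grades. The names `SAMPLE_GRADES`, `grade` and `is_interested`, the sample entries, and the use of "-" as the symbol for a bad grade are assumptions made for illustration only.

```python
# Hypothetical sample data: "+" is good, "-" is bad, "0" is not accessed.
SAMPLE_GRADES = {
    ("X", "A Good Book"): "+",
    ("A", "A Good Book"): "+",
    ("A", "K"): "+",
    ("B", "K"): "-",
}


def grade(user, document, grades=SAMPLE_GRADES):
    """Return the stored grade; documents with no entry count as not accessed."""
    return grades.get((user, document), "0")


def is_interested(user, document, grades=SAMPLE_GRADES):
    """A user is interested in a document exactly when its grade is "+"."""
    return grade(user, document, grades) == "+"


print(is_interested("A", "K"))    # True
print(grade("C", "A Good Book"))  # 0
```

Treating a missing entry as "0" keeps the model sparse: only documents a user has actually graded need to be stored.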

Interest As A Proximity Measure

The basis for the algorithm is the following statement, which we will call the Recommendation Hypothesis: If a user A is interested in documents K and L, and another user X is interested in K, it is likely that X will also like L.

Phrased differently, the hypothesis will be: If users agree on a document they will agree on others as well.

The starting point for the discussion was to find an adequate reply to the request "I read A Good Book, I want more of the same". We will now inspect what cases may occur if similarity is defined according to the Recommendation Hypothesis. User X poses the request above to the database. The database contains, among others, entries for the users A, B and C as in table 1.

User X will need to figure out what grades to pay most attention to. It is obvious that user A has most to say: users X and A agree on the quality of A Good Book. User A also has an interest in document K. It seems reasonable to assume that document K may be interesting for user X as well. It is equally reasonable to assume that user C, who has not read A Good Book, will have little to contribute in this discussion, and that document M, of which no user can say much, will have an indeterminate grading as a result of the models inspected. Formalizing these observations, we will define recommendations as a product of interests.

User    A Good Book    K    L    M
[grades for the individual users not recoverable]

Table 1: Sample User Models

⊗    +                     -                              0
+    Another good book.    Warning.                       Don't know.
-    A better book.        Another uninteresting book.    Don't know.
0    Don't know.           Don't know.                    Don't know.

Table 2: A Qualitative Recommendation Algebra

Recommendations — An Algebra on Interests

Recommendations will be defined as products of interests. Given A Good Book, the parameters are what interest the user shows in it, and what the user can say of other books. The matrix in table 2 covers the cases that can arise. The leftmost vertical column indicates the grades the user has given A Good Book, and the top row the grades a user has given other books.

Now, having defined recommendations, the question is how to use them. The likelihood of a recommendation being useful grows with the number of users that agree, and the number of documents they agree on. The whole point is to sum recommendations over all users. To do this, we need to quantify them. The discussion in the preceding section has defined the matrix shown in table 3 for the recommendation operator ⊗.

 

 

 

 

⊗    +    -    0
+    1    ?    0
-    ?    ?    0
0    0    0    0

Table 3: A Template For A Quantitative Recommendation Algebra

 

 

 

 

⊗    +    -    0
+    1    0    0
-    0    0    0
0    0    0    0

Table 4: A Quantitative Recommendation Algebra

 

 

 

 

[Matrix not recoverable; surviving entries: 0, 1, 0, -1, 0, 0]

Table 5: Another Quantitative Recommendation Algebra

The question is what values to insert in the matrix in the cells now containing question marks. The simplest alternative is the one shown in table 4.

Another realization of ⊗ is represented by the matrix in table 5.

We will in the experiment section below concentrate on the simpler algebra defined in table 4, and defer further discussion of more complex algebras.
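To make the simpler algebra concrete, the sketch below encodes table 4 as a lookup from a pair of grades to a number. The names `TABLE_4`, `recommend` and `is_symmetric` are hypothetical, and "-" is assumed as the symbol for a bad grade.

```python
GRADE_SYMBOLS = ("+", "-", "0")

# Table 4: every combination is worth 0 except two "+" grades, which give 1.
TABLE_4 = {(seed, other): 0 for seed in GRADE_SYMBOLS for other in GRADE_SYMBOLS}
TABLE_4[("+", "+")] = 1


def recommend(seed_grade, other_grade, algebra=TABLE_4):
    """Value of one user's recommendation, given that user's two grades."""
    return algebra[(seed_grade, other_grade)]


def is_symmetric(algebra):
    """True if swapping the two grades never changes the value."""
    return all(algebra[(a, b)] == algebra[(b, a)] for (a, b) in algebra)


print(recommend("+", "+"))    # 1
print(recommend("+", "-"))    # 0
print(is_symmetric(TABLE_4))  # True
```

Keeping the algebra as data rather than code means another matrix can be swapped in by passing a different dictionary.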

Proximity — A Sum Over Recommendations

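The proximity from a document to another, from A Good Book to document K, can be defined as a sum over all readers' interests in A Good Book and K:

$$\mathrm{proximity}(\mathit{document}_{AGB}, \mathit{document}_{K}) = \sum_{i} \mathrm{interest}(\mathit{reader}_i, \mathit{document}_{AGB}) \otimes \mathrm{interest}(\mathit{reader}_i, \mathit{document}_{K})$$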

This sum will then be used for clustering documents, using any standard algorithm. Note, however, that as we have defined it, the proximity measure does not need to be symmetrical: the proximity from A Good Book to document K does not have to be equal to the proximity from document K to A Good Book. This naturally depends on the definition of ⊗. The two algebras defined in tables 4 and 5 are different in this respect: the first is symmetric while the second is not. Standard clustering algorithms require the proximity or distance measure to be symmetric: this is one reason we here only address such simpler algebras.
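The sum over readers can be sketched as follows, using the table 4 algebra. The user models in `MODELS`, the names `ALGEBRA` and `proximity`, and the "-" symbol for a bad grade are assumptions chosen for illustration; they are not the data used in the experiments.

```python
# Table 4 algebra (assumed grade symbols "+", "-", "0"): only ("+", "+") scores 1.
ALGEBRA = {(a, b): 0 for a in "+-0" for b in "+-0"}
ALGEBRA[("+", "+")] = 1

# Hypothetical user models: user -> {document: grade}; missing documents count as "0".
MODELS = {
    "A": {"A Good Book": "+", "K": "+", "L": "+"},
    "B": {"A Good Book": "+", "K": "-"},
    "C": {"K": "+", "L": "+"},
    "X": {"A Good Book": "+"},
}


def proximity(doc_from, doc_to, models=MODELS, algebra=ALGEBRA):
    """Sum the recommendation values of every user's grades for the two documents."""
    return sum(
        algebra[(grades.get(doc_from, "0"), grades.get(doc_to, "0"))]
        for grades in models.values()
    )


print(proximity("A Good Book", "K"))  # 1: only user A likes both
print(proximity("K", "L"))            # 2: users A and C like both
```

With this algebra the proximity of two documents is simply the number of users who graded both as good, which is why the resulting matrix is symmetric.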

 

Larger seed set

A natural extension of the discussion so far is to use several documents as a starting point, a seed set, for the query: "I read These Good Books, I want more of the same". It is not intuitively clear how the grades for one document should be compiled into one grade: a simple sum will probably not give enough credit to documents that co-occur with a large part of the seed set. We have not addressed this problem in this first study.

Clustering

As noted above, in standard clustering algorithms from the statistical literature the proximity measure has to be symmetrical (see for instance Miyamoto, 1989). This is not necessary for any reason inherent in the nature of proximity or of documents, but for practical reasons: to be able to use standard unmodified clustering algorithms we have chosen to use the algebra defined in table 4. The clustering algorithm we use is a standard average linkage agglomerative method.
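For readers unfamiliar with average linkage, a small pure-Python sketch follows. It is a generic textbook formulation, not the program used in the experiments; the function name `average_linkage`, the stopping rule `n_clusters`, and the toy distance matrix are assumptions.

```python
def average_linkage(distance, n_clusters):
    """Agglomerate items until n_clusters remain, always merging the two
    clusters whose members have the smallest mean pairwise distance."""
    clusters = [[i] for i in range(len(distance))]

    def mean_distance(first, second):
        total = sum(distance[i][j] for i in first for j in second)
        return total / (len(first) * len(second))

    while len(clusters) > max(n_clusters, 1):
        # Find the closest pair of clusters under average linkage.
        _, a, b = min(
            (mean_distance(clusters[a], clusters[b]), a, b)
            for a in range(len(clusters))
            for b in range(a + 1, len(clusters))
        )
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]  # b > a, so index a is unaffected
    return [sorted(cluster) for cluster in clusters]


# Toy symmetric distance matrix: items 0 and 1 are close, as are items 2 and 3.
TOY_DISTANCE = [
    [0.05, 0.10, 0.50, 0.50],
    [0.10, 0.07, 0.50, 0.50],
    [0.50, 0.50, 0.08, 0.12],
    [0.50, 0.50, 0.12, 0.08],
]
print(average_linkage(TOY_DISTANCE, 2))  # [[0, 1], [2, 3]]
```

Only distances between different items are consulted, so the diagonal of the matrix does not influence the result.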

Experiments

Data Failure: Pilot Experiment

25 subjects were asked to grade 150 novels and 150 video movies with one of the three grades above. This material was processed with different ⊗ matrices. This data proved unsatisfactory: the matrix was simply too sparse. The data set was too heterogeneous and the reasons behind reading novels and watching video movies too diverse, even in the relatively homogeneous population the subjects were taken from: all were graduate students of Columbia University.

Usenet News Domain

A number of .newsrc files were gathered and processed. A typical .newsrc file has an appearance as in the excerpt shown in table 6. The newsgroup name is followed by a character, ":" or "!", which indicates if the newsgroup is subscribed to or not. The digits following the subscription character indicate which messages in the newsgroup have been read. These could be utilized to refine the grade set from binary to a continuous scale, but we have elected to keep the model simple, initially. The ⊗ matrix used was the one shown in table 7. One of the good points of using Usenet News data is that it is reasonably easy to validate the clustering by the newsgroup names, which are fairly indicative of content.

news.announce.conferences: 1-5260
news.announce.important: 1-51
alt.tv.rockford-files: 1-253
swnet.jobs: 1-232
misc.jobs.offered! 1-31344
dk.jobs! 1-24
comp.lang.prolog: 1-9033
comp.risks: 1-6314
comp.society.futures! 1-3343
comp.society.privacy: 1-1993
comp.cog-eng! 1-2247
comp.ai.  1-1577
comp.ai! 1-17353
sics.general: 1-244
sics.sicstus: 1-404
sics.syschanges: 1-1060
sics.personal.forening: 1-76

Table 6: An excerpt from a typical .newsrc file
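A sketch of how such a file could be turned into a simple user model is given below. The function name `parse_newsrc` is hypothetical, and the mapping of subscribed groups to "+" and all other groups to "0" is an assumption; the paper only states that the grade set is binary.

```python
def parse_newsrc(text):
    """Map each newsgroup in a .newsrc-style text to a grade.

    Assumption: ":" (subscribed) becomes "+", "!" (unsubscribed) becomes "0".
    The article ranges after the marker are ignored in this simple model.
    """
    grades = {}
    for line in text.splitlines():
        line = line.strip()
        # The first ":" or "!" on the line ends the newsgroup name.
        positions = [p for p in (line.find(":"), line.find("!")) if p != -1]
        if not positions:
            continue  # blank or malformed line
        cut = min(positions)
        name = line[:cut].strip()
        if name:
            grades[name] = "+" if line[cut] == ":" else "0"
    return grades


SAMPLE = """news.announce.conferences: 1-5260
misc.jobs.offered! 1-31344
comp.lang.prolog: 1-9033
"""
print(parse_newsrc(SAMPLE))
```

For the sample above, the two groups followed by ":" are graded "+" and the group followed by "!" is graded "0".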

 

 

 

 

[Matrix not recoverable; surviving entries: 1, 0, 0]

Table 7: Experiment Quantitative Recommendation Algebra

local.general                   swnet.sources.list
swnet.jobs                      swnet.diverse
local.syschanges                kth.sun
local.personal.forening         kth.mac
local.alla                      kth.general
local.sicstus                   eunet.news
comp.lang.prolog                comp.ai.vision
local.protokoll                 comp.ai.nlang-know-rep
local.mac                       comp.ai.edu
local.system                    swnet.mail.map
local.library                   sci.logic
comp.ai                         local.ai-in-medicine
news.announce.conferences       alt.binaries.pictures.supermodels
local.test                      alt.binaries.pictures.erotica.blondes
news.announce.important         alt.binaries.pictures.erotica.female
swnet.pryltorg                  alt.binaries.pictures.erotica
swnet.ai.neural-nets            news.answers
kth.unix                        comp.cog-eng
comp.ai.shells                  swnet.conferences
news.announce.newusers          aus.jobs
misc.jobs.offered               comp.newprod
comp.ai.digest                  swnet.sunet-info
kth.data                        swnet.org.snus
swnet.utbildning.grundbulten    alt.crackers
swnet.sys.cv                    comp.sys.workstations

Table 8: The fifty most popular newsgroups in experiment 1 in descending order of popularity.

Experiment 1: n = 60

In the first experiment sixty .newsrc files were used. In this data set, the fifty most subscribed² newsgroups are the ones shown in table 8. If recommendation statistics for these newsgroups are calculated we will get the matrix shown in table 9. This data is transformed to a distance measure matrix to get the matrix in table 10 by adding one to each element (to avoid zero values) and then inverting them all.
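The transformation described above, adding one to each proximity and inverting, can be sketched as follows. The function name `to_distance` and the small sample matrix are illustrative assumptions.

```python
def to_distance(proximity_matrix):
    """Turn proximities into distances: add one to each element, then invert."""
    return [[1.0 / (value + 1) for value in row] for row in proximity_matrix]


# Small sample of proximity counts (illustrative values only).
SAMPLE_PROXIMITY = [
    [16, 9],
    [9, 7],
]
for row in to_distance(SAMPLE_PROXIMITY):
    print(row)
```

A proximity of 9 becomes a distance of exactly 0.1 and a proximity of 7 becomes 0.125, while a proximity of 0 maps to the largest possible distance, 1.0, which is why one is added before inverting.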

The distance matrix in table 10 clustered as shown in table 11. Some of the clusters are remarkably well put together, whereas others may seem more haphazard. There is a local news set (including job ads!), a local system set, a technical set, a Prolog set, an AI set, a swnet set, an erotic picture set, and a couple of less easily labelable but nonetheless not completely weird sets.

² It would be more interesting to be able to say that these are the most read newsgroups. This we cannot do, although it is conceivable that we could extract this from the numerical info on seen articles that follows the subscription information. It is not self-evident, however, that an article which is seen has also been read. For this reason, initially, we will be cautious about making the inference.

16   9   9  11  10   8   7  10   9   7   9   7
 9   ?   7   6   5   5   7   5   4   5   5   6
 9   7  11   8   5   7   7   5   6   6   5   5
11   6   8  11   8   7   5   9   8   6   8   4
10   5   5   8  11   8   5   7   6   6   5   3
 8   5   7   7   8  10   6   5   5   7   4   3
 7   7   7   5   5   6  10   3   4   4   4   5
10   5   5   9   7   5   3  11   7   5   9   4
 9   4   6   8   6   5   4   7   9   4   7   4
 7   5   6   6   6   7   4   5   4   9   4   2
 9   5   5   8   5   4   4   9   7   4   9   5
 7   6   5   4   3   3   5   4   4   2   5   8

(? marks a value that could not be recovered.)

Table 9: Excerpt from proximity matrix for the fifty most popular newsgroups in experiment 1

0.05  0.10  0.10  0.08  0.09  0.11  0.12  0.09  0.10  0.12  0.10  0.12
0.10  0.07  0.12  0.14  0.16  0.16  0.12  0.16  0.20  0.16  0.16  0.14
0.10  0.12  0.08  0.11  0.16  0.12  0.12  0.16  0.14  0.14  0.16  0.16
0.08  0.14  0.11  0.08  0.11  0.12  0.16  0.10  0.11  0.14  0.11  0.20
0.09  0.16  0.16  0.11  0.08  0.11  0.16  0.12  0.14  0.14  0.16  0.25
0.11  0.16  0.12  0.12  0.11  0.09  0.14  0.16  0.16  0.12  0.20  0.25
0.12  0.12  0.12  0.16  0.16  0.14  0.09  0.25  0.20  0.20  0.20  0.16
0.09  0.16  0.16  0.10  0.12  0.16  0.25  0.08  0.12  0.16  0.10  0.20
0.10  0.20  0.14  0.11  0.14  0.16  0.20  0.12  0.10  0.20  0.12  0.20
0.12  0.16  0.14  0.14  0.14  0.12  0.20  0.16  0.20  0.10  0.20  0.33
0.10  0.16  0.16  0.11  0.16  0.20  0.20  0.10  0.12  0.20  0.10  0.16
0.12  0.14  0.16  0.20  0.25  0.25  0.16  0.20  0.20  0.33  0.16  0.11

Table 10: Excerpt from distance matrix for the fifty most popular newsgroups in experiment 1

Experiment 2: n = 600

In the second experiment six hundred .newsrc files were used. In this data set the fifty most subscribed newsgroups are the ones shown in table 12. The recommendation statistics for these newsgroups are shown in table 13. This data is again, as in the previous experiment, transformed to a distance measure matrix to get the matrix in table 14 by adding one to each element (to avoid zero values) and then inverting them all.

The distance matrix in table 14 clustered as shown in table 15. Again, some of the clusters are remarkably well put together, whereas others may seem more haphazard. Most of them have a reasonably well-defined profile, however. There is a general Swedish discussion group set, a programming group set, a network set, a picture set, an Uppsala news set (including classified ads and job announcements), and finally, the two local newsgroups.

Relevance as a Tool for Information Retrieval

A distance or proximity measure based on heuristics like these can only be expected to produce useful results up to a point. This measure is in a certain sense orthogonal to standard metrics for document classification: indexing schemes, word statistics and the like. It does not analyze the material with respect to its content, as do standard information retrieval metrics. Neither does it take into account the formal guise of the material, as do some experimental metrics, for instance in calculating probable text genre (Karlgren and Cutting, 1994). This means that it can be expected to produce results as an additional layer added on to existing and future traditional content- and genre-based information retrieval mechanisms, not as a complete tool on its own.

Relevance to the Study of User Modeling

We do not expect the tool to be very useful in the Usenet News domain: this experiment was just to show the utility of the algorithm on easily available data. In the Usenet News domain most users have a reasonable overview of what is available on the net. However, we will evaluate the technique in the IntFilter project at SICS and Stockholm University. The IntFilter project aims at producing interactive filtering tools for information flows, and we will use the clustering mechanism described here to produce standard stereotypical .newsrc user models for new users. New users can then get a whole package of files to start the interaction with without having to immerse themselves in the entire Usenet News flow.

 

local.general, swnet.jobs, local.syschanges, local.personal.forening
swnet.utbildning.grundbulten, swnet.sys.cv, swnet.sources.list, swnet.diverse, swnet.conferences
comp.ai, news.announce.conferences, comp.ai.shells, comp.ai.nlang-know-rep
swnet.ai.neural-nets, comp.ai.edu
comp.ai.vision, sci.logic
swnet.sunet-info, swnet.org.snus
alt.binaries.pictures.supermodels, alt.binaries.pictures.erotica.blondes, alt.binaries.pictures.erotica.female, alt.binaries.pictures.erotica
misc.jobs.offered
local.ai-in-medicine
comp.newprod
local.protokoll, local.mac, local.library, news.announce.important
univ.unix, univ.mac, univ.general, eunet.news, swnet.mail.map
local.system, local.test, univ.data, univ.sun
alt.crackers, comp.sys.workstations
news.answers, news.announce.newusers
swnet.pryltorg, aus.jobs
local.alla, local.sicstus, comp.lang.prolog
comp.ai.digest
comp.cog-eng

Table 11: Clusters in experiment 1.

 

news.announce.newusers               alt.sex.stories
swnet.jobs                           swnet.ai.neural-nets
swnet.pryltorg                       uppsala.mail
uppsala.news                         swnet.sunet-info
uppsala.general                      swnet.test
swnet.general                        alt.binaries.pictures.utilities
swnet.wanted                         swnet.svenska
swnet.diverse                        nordunet.general
swnet.conferences                    alt.sex
uppsala.test                         alt.3d
uppsala.games                        swnet.siren
local.news                           swnet.sys.sun
swnet.unix                           swnet.sources
swnet.sys.amiga                      comp.sources.unix
swnet.sys.ibm.pc                     alt.sex.pictures.female
swnet.followup                       swnet.snus
swnet.sys.mac                        alt.sources
swnet.mail                           gnu.announce
alt.binaries.pictures.supermodels    comp.sources.x
comp.lang.c++                        swnet.sources.list
swnet.politik                        alt.binaries.pictures.misc
news.announce.important              comp.lang.c
alt.binaries.pictures                alt.binaries.pictures.erotica
local.diverse                        swnet.sys.sun.flash
comp.lang.prolog                     swnet.mac

Table 12: The fifty most popular newsgroups in experiment 2 in descending order of popularity.

209   66   59   48   56   57   50   45   50   57   40   24   47
 66  161  115   77   79   85   85   63   79   63   58   45   71
 59  115  144   74   77   80   84   63   74   66   53   38   70
 48   77   74  119   86   51   45   39   57   45   72   48   45
 56   79   77   86  115   63   52   42   59   59   68   40   54
 57   85   80   51   63   95   69   54   65   62   40   22   65
 50   85   84   45   52   69   92   42   59   55   36   23   61
 45   63   63   39   42   54   42   90   40   42   24   28   40
 50   79   74   57   59   65   59   40   90   55   37   27   57
 57   63   66   45   59   62   55   42   55   87   33    ?   46
 40   58   53   72   68   40   36   24   37   33   86   34   37
 24   45   38   48   40   22   23   28   27    ?   34   84   22
 47   71   70   45   54   65   61   40   57   46   37   22   83

(? marks a value that could not be recovered.)

Table 13: Excerpt from proximity matrix for the fifty most popular newsgroups in experiment 2.

0.004761904761904762  0.014925373134328358  0.016666666666666666
0.014925373134328358  0.006172839506172839  0.008620689655172414
0.016666666666666666  0.008620689655172414  0.006896551724137931
0.020408163265306120  0.012820512820512820  0.013333333333333334

Table 14: Excerpt from distance matrix for the fifty most popular newsgroups in experiment 2.

swnet.wanted
comp.lang.c++
swnet.conferences
comp.lang.prolog
uppsala.test
alt.sex
swnet.unix
comp.sources.unix
swnet.sys.amiga
alt.sources
swnet.sys.ibm.pc
gnu.announce
swnet.followup
comp.sources.x
swnet.sys.mac
swnet.sources.list
swnet.mail, swnet.politik
comp.lang.c
news.announce.important
swnet.ai.neural-nets
swnet.sunet-info
swnet.test
swnet.sys.sun
nordunet.general
swnet.sources, swnet.snus
swnet.siren
news.announce.newusers
alt.binaries.pictures
swnet.jobs
alt.binaries.pictures.utilities
swnet.pryltorg, uppsala.news
alt.sex.stories
uppsala.general
local.news
swnet.general
alt.3d, swnet.sys.sun.flash
local.diverse
swnet.diverse
uppsala.games
uppsala.mail
alt.binaries.pictures.supermodels
swnet.svenska
alt.sex.pictures.female
alt.binaries.pictures.misc
alt.binaries.pictures.erotica

Table 15: Clusters in experiment 2.

Interactivity Aspects

How to design an interaction using techniques such as these, which by necessity will seem complex to the casual user, is a question we address in a separate publication (Karlgren et al., 1994); in connection with this technique we must note that a complex algebra, such as the one tentatively defined in table 5, will be difficult to explain to the user. Indeed, one of the reasons we did not pursue the study of it further was the complexity involved in debugging output from the program. Even in development stages, when the algorithm was parametrizable and highly salient, its behavior became complex. This is not the last word on algebra design, however: the alternative quantifications must be studied further before judgement is passed in this matter.

Integrity Aspects

An obvious stumbling block for utilizing user models in this manner is that of user integrity. Integrity questions are important to consider and difficult to resolve in a straightforward way. Indeed, this experiment itself is an illustration of the personal integrity problem complex. The .newsrc files used in the experiment were taken from openly available systems at universities and research institutes the IntFilter project has access to. The users themselves were not asked, but were assumed to have given their permission implicitly by their having the protection of the .newsrc files set so that they were readable outside their immediate work group. The .newsrc files were immediately de-identified so none of the data can be attributed to any single user; however, no users actually were made aware of the fact that their reading habits were bandied about for public scrutiny. A tentative solution to empower the user would be to allow the user an unlimited number of identities, thus letting users partition their reading habits by pseudonyms (Bratt et al., 1983). Obviously this does not solve all problems, but a solution of this type at least redresses some of the balance that the system otherwise takes from the user.

References

Hans Iwan Bratt, Hans Karlgren, Ulf Keijer, Tomas Ohlin, Gunnar Rodin. (1983). "En liberal datapolitik", Tempus, 16-19/12, Stockholm.

Benny Brodda. (1992). "Gimmie More O' That", IRI Publication. Stockholm: IRI.

Jussi Karlgren and Douglass Cutting. (1994). "Recognizing Text Genres with Simple Metrics Using Discriminant Analysis". Submitted to Coling 1994, Kyoto.

Jussi Karlgren, Kristina Höök, Ann Lantz, Jacob Palme, Daniel Pargman. (1994). "The glass box user model for filtering". Submitted to the First International Conference On User Modeling, Cape Cod.

Jussi Karlgren. (1992). The Interaction of Discourse Modality and User Expectations in Human-Computer Dialog. Licentiate Thesis at the Department of Computer and Systems Sciences, University of Stockholm.

Jussi Karlgren. (1990). "An Algebra for Recommendations", Syslab Working Paper 179, Department of Computer and System Sciences, Stockholm University, Stockholm.

Robert Kass. (1991). "Building a User Model Implicitly from a Cooperative Advisory Dialog", UMUAI 1:3, pp. 203-258.

Sadaaki Miyamoto. (1989). Fuzzy Sets in Information Retrieval and Cluster Analysis. Dordrecht: Kluwer.

Elaine Rich. (1979). "User Modeling via Stereotypes", Cognitive Science, vol. 3, pp. 329-354.

Gerald Salton and Michael McGill. (1983). Introduction to Modern Information Retrieval. New York: McGraw-Hill.

 

 

 
