Web Based Information Architectures
Fall 2004

A graduate course for Computer Science majors, Fall 2004, Peking University
Instructor: Professor LI Xiaoming, lxm@pku.edu.cn


Reading Assignments

Presentations of the reading assignments begin on 5 November and are arranged in five sections (see the timetable below). Each student has 10-15 minutes to present, and students present in order of ID number. If you have a time conflict, you may swap slots with a classmate by mutual agreement. In particular, students who have class on Friday afternoon may present first in Prof. Wang's sections, starting at 12:00.
Prof. Wang's Section 5 is reserved for students who were absent from their scheduled slot in his sections.
Please turn in your report and presentation slides to the designated instructor in advance.
Enjoy! :)

Location: Meeting Room 1721, Science Building 1

Section  ①            ②            ③            ④            ⑤
Wang     11/5/2004    11/12/2004   11/19/2004   11/26/2004   12/17/2004
         12:00-16:00  12:00-16:00  12:00-16:00  12:00-16:00  12:00-16:00
Peng     11/7/2004    11/14/2004   11/21/2004   11/28/2004   12/5/2004
         14:00-17:30  14:00-17:30  14:00-17:30  14:00-17:30  14:00-17:30

ID NAME PAPER Schedule
10208824 ÃÏÌÎ A statistical physics perspective on Web growth.pdf Peng ⑤

Wang ①
10380026 ÍõÎÄÃ÷ A Comparative Study on Feature Selection in Text Categorization (1997).pdf
10380043 ³ÂÈðâù Evaluating and Optimizing Autonomous Text Classification Systems (1995).pdf
10308081 ³Â¹ú»Ô A Tutorial on Support Vector Machines for Pattern Recognition (1998).pdf
10308085 ³ÂÏÉ Machine Learning in Automated Text Categorization (1999).pdf
10308087 ´Þ¸ßÓ± Optimized Query Execution in Large Search Engines with Global Page Ordering (2003)
10308088 Û¡éª A New Family of Online Algorithms for Category Ranking.pdf
10308127 ÂÞêØ¹â A Scalability Analysis of Classifiers in Text Categorization.pdf
10308131 ÂíÔÆÏö A study of Approaches to Hypertext Categorization.pdf
10308140 Éò¼á A Tutorial on Automated Text Categorisation.pdf
10308170 ÐÒÔË An Evaluation of Statistical Approaches to Text Categorization.pdf
10308171 Ð춬 Bayesian Online Classifiers for Text Classification and Filtering.pdf
10308174 Ѧ´óÓî Building Text Classifiers Using Positive and Unlabeled Examples.pdf
10308183 ÕÅºÆ Combining Naive Bayes and n-Gram Language Models for Text Classication.pdf
10308185 ÕÅÀÙ Mercator- A Scalable, Extensible Web Crawler.pdf Peng ①

Wang ①
10308190 ÕÔ¾² Improving text categorization methods for event tracking.pdf
10308197 ´÷éª Partitioning-based clustering for web document categorization.pdf
10308201 Íõΰ Web Page Classification based on Document Structure.pdf

Wang ②
10308840 Àîçü Clustering Hypertext With Applications To Web Searching (2000).pdf
10312011 ÑîÎÄ°× Scaling Clustering Algorithms to Large Databases (1998).pdf
10312054 µËáÍ Data Clustering A Review (1999).pdf
10312060 Ðìºè Spectral Analysis of Data.pdf
10330016 ´Þ½¨º£ Initialization of Iterative Refinement Clustering Algorithms (1998).pdf
10380021 Áø½£·æ A Comparison of Document Clustering Techniques.pdf
10412012 ÕÅöÎ An algorithm to cluster documents based on relevance.pdf
10448140 ÀîÏ£æÃ Clickstream Clustering using Weighted Longest Common Subsequences.pdf
10448144 ÁºÏ£ÔÆ clustering of web users based on access patterns.pdf
10448145 Áõµ¤ÐÇ Document Clustering Based On Non-negative Matrix Factorization.pdf
10448149 ÁõÒæ³É Document Clustering with Cluster Refinement and Model.pdf
10448152 Ì·Ìì Document Clustering with Committees.pdf
10448153 Ìï·ã Agglomerative clustering of a search engine query log.pdf
10448168 ÕÅÀÚ Explaining Text Clustering Results using Semantic Structures.pdf
10448172 ÖìÒàÕæ Query Clustering Using User Logs.pdf

Peng ①
10448177 °×»ª Focused crawling a new approach to topic-specific Web resource discovery (1999).pdf
10448178 °üÓ¾ü Crawling the HiddenWeb(2001).pdf
10448180 ³ÂÊØÁÁ Focused Crawling Using Context Graphs (2000).pdf
10448181 ¸ßº£±ó A Community-Aware Search Engine.pdf
10448182 ¹ùî£ Design and Implementation of a High-Performance Distributed Web Crawler.pdf
10448183 º£Àö¾ê Efficient URL Caching for World Wide Web Crawling.pdf
10448184 ºÎ¾¸ High-Performance Web Crawling (2001) .pdf
10448186 »Æ¼Î¶ Optimal Crawling Strategies for Web Search Engines.pdf
10448188 Àî·Ò Parallel Crawlers (2002).pdf
10448190 ÁÖ»³¶« UbiCrawler A Scalable Fully Distributed Web Crawler (2002).pdf
10448191 Áõæ­ What's New on the Web The Evolution of the Web from a Search Engine Perspective(2004)

Peng ②
10448193 ÂíµÏ (Brin&Page1998)The Anatomy of a Large-Scale Hypertextual Web Search Engine.pdf
10448194 ÂíÇ¿ Estimating the Usefulness of Search Engines (1999).pdf
10448196 ÇñÖ¾»¶ Do TREC web collections look like the web.pdf
10448198 Ê©äø Efficient Construction of Large Test Collections (1998).pdf
10448201 ÍõÀÚ Engineering a multi-purpose test collection for Web retrieval experiments (2001).pdf
10448205 ÕÔµ¤ Evaluating Evaluation Measure Stability (2000) .pdf
10448206 ÕÔÑÅ How reliable are the results of large-scale information retrieval experiments.pdf
10448207 ÕÔÑô Measuring Search Engine Quality.pdf
10448208 ÖÜÕþ Unbiased Evaluation of Retrieval Quality using clickthrough data.pdf
10448209 ×Þ¿¡·å Results and challenges in Web search evaluation.pdf

Peng ③
10448211 ³ÂÁ¼ Information Extraction with HMM Structures Learned by Stochastic Optimization (2000)
10448215 ³ÂÔª Querying the World Wide Web (1997).pdf
10448216 ³É¸» Information Extraction Using Hidden Markov Models (1997).pdf
10448218 ³ÌÖ¾ Learning Hidden Markov Model Structure for Information Extraction (1999).pdf
10448220 ¶¡Ê÷¿­ Semi-automatic Wrapper Generation for Internet Information Sources (1997).pdf
10448221 ¶¡ÍòËÉ Relational Learning of Pattern-Match Rules for Information Extraction (1998).pdf
10448222 ñ¼ÎÄÃô Nymble High-Performance Learning Name-Finder (1997).pdf
10448226 ¹Ø·½ÐË Learning Information Extraction Rules for Semi-structured and Free Text (1999).pdf
10448227 ¹ù»¯éª Automatic Web News Extraction Using Tree Edit Distance.pdf
10448228 ºÎÖÜÖÛ Information Extraction from World Wide Web A Survey (1999).pdf
10448229 ºÍÔÆ·å Winners dont take all.pdf
10448230 ºîäìäì Using the Structure of Web Sites for Automatic Segmentation of Tables.pdf
10448235 Àîê» Web-Scale Information Extraction in KnowItAll.pdf

Peng ④
10448237 Àî¾²¾² Automatic Word Sense Discrimination (1998).pdf
10448238 ÀîïÇ Finding Approximate Matches in Large Lexicons (1995).pdf
10448239 ÀîÇÚ·É Compression of Inverted Indexes For Fast Query Evaluation (2002).pdf
10448242 ÀîÓñ¾ê A Fast Regular Expression Indexing Engine.pdf
10448244 ÁÖÁÁ Adding Compression to Block Addressing Inverted Indexes (2000) .pdf
10448245 Áõ³¤·É Compressed Inverted Files with Reduced Decoding Overheads.pdf
10448249 ÁõÒ¢ Modeling Score Distributions for Combining the Outputs of Search Engines
10448251 ÁõÓáò¡ Efficient Phrase Querying with an Auxiliary Index (2002).pdf
10448253 Âí³ÒÓî On the Integration of Structure Indexes and Inverted Lists.pdf
10448255 Å·ÑôÓÓ Text Categorization Based on Regularized Linear Classification Methods (2000).pdf Wang ①

Peng ④
10448257 ÅËС˫ What's Next Index Structures for Efficient Phrase Querying  .pdf
10448258 ÅíÓî Self-Indexing Inverted Files for Fast Text Retrieval (1996).pdf

Wang ③
10448259 ÅíÔÿ­ A Language Modeling Approach to Information Retrieval (1998).pdf
10448264 Ëïèò Improved Algorithms for Topic Distillation in a Hyperlinked Environment (1998).pdf
10448268 Íõè± Automatic Resource list Compilation by Analyzing Hyperlink Structure and Associated Text (1998).pdf
10308164 ÎâÖÇ·¢ Information Retrieval on the Web.pdf
10448272 ÎâÃ÷»Ô Learning Routing Queries in a Query Zone (1997).pdf
10448274 Ðì´ºÏã Information Retrieval as Statistical Translation (1999).pdf
10448277 ÑîÀ¤ Query Expansion Using Local and Global Document Analysis (1996).pdf
10448282 Ô¬Áá A System for New Event Detection.pdf
10448284 Õž² An algorithmic framework for collaborative filtering.pdf
10448286 ÕÅÇØÁú SALSA The Stochastic Approach for Link-Structure Analysis.pdf
10448289 ÕÅìÏÓî Toward a unified approach to statistical language modeling for Chinese .pdf
10448290 ÕÔ¶« Indexing by Latent Semantic Analysis (1990).pdf
10448291 ÕÔæº Fast Algorithms for Mining Association Rules (1994).pdf
10448292 ÖÜÁ¢ Analysis of a Very Large Web Search Engine Query Log.pdf

Wang ④
10448293 ÖÜϼ Entropy-Based Link Analysis for Mining Web Informative Structures.pdf
10448294 ÖÜÏþ³ Improvement of HITS-based Algorithms on Web Documents.pdf
10448295 ÖÜÖ¾Ô¶ Mining Anchor Text for Query Refinement.pdf
10448296 Öì¼ÎÆæ Mining Logest repeating subsequences to predict world wide web surfing.pdf
10448297 ÖìÑÇ Probe, Count, and Classify_ Categorizing Hidden Web Databases.pdf
10448298 ׯÀÚ Topic-Sensitive PageRank.pdf
10448299 ×Þ½¡ Web Usage Mining in SE.pdf
10448303 µË¹ú Authoritative Sources in a Hyperlinked Environment (1998).pdf
10448304 ¹ùÐÂÓî Item-based Collaborative Filtering Recommendation Algorithms (2001).pdf
10448305 »ÆæÃ¶ù Preserving two-level caching for scalable search engines .pdf
10448306 ½­Áë Finding replicated web collections.pdf
10448307 Àîºè Predictive Caching and Prefetching of Query Results in Search Engines.pdf
10448309 ÀîÄþ²¨ Rank-Preserving Two-Level Caching for Scalable Search Engines.pdf
10448310 ÀîÑà STARTS Stanford Proposal for Internet Mets-Searching.pdf

Peng ⑤
10448311 ÁÎËØÃô Keeping Up With The Changing Web (2000).pdf
10448312 ÁõÕñ¶« Estimating frequency of change.(2003).pdf
10448313 ÂÀ½à A Re-Examination of Text Categorization Methods (1999).pdf
10448314 Âí¹úÒ« A stochastic model for the evolution of the Web.pdf
10448317 ́Ȼ Effective page refresh policies for Web crawlers.(2003).pdf
10448318 ÕÅÅô How dynamic is the Web.pdf
10448319 ÕÅÇÙ The Evolution of the Web and Implications for an incremental Crawler(2000).pdf
10448321 ÕÔ¾² Whats New on the Web The Evolution of the Web from a Search Engine Perspectivepdf.pdf
10448322 Öìΰ Estimating the Relative Size and Overlap of Public Web Search Engines
10448323 ²Ü¶«Ö¾ Evolutionary Dynamics of the World Wide Web..pdf
10448324 ·ëÉÙ»Ô Graph structure in the web
10448325 ÁõÎÄ Impact Of Search Engines On Page Popularity.pdf
10448326 ·±ó Modeling the Growth of Future Web.pdf
10448327 ÓȺ£Åô Representing Web Graphs.pdf
10448330 ÖìÖÒΰ Searching the World Wide Web (1998) .pdf

Last modified: 2004-11-28 22:50:00