A CLUSTERING BASED WEB PREFETCHING IN HIGH TRAFFIC ENVIRONMENT

TABLE OF CONTENTS
TITLE
ABSTRACT

CHAPTER ONE
INTRODUCTION
1.1       Background of Study
1.2       Problem Statement
1.3       Motivation
1.4       Aim and Objectives
1.5       Research Method
1.6       Organization of Dissertation

CHAPTER TWO
LITERATURE REVIEW
2.1       Introduction
2.2       Web Caching
2.3       Types of Web Cache
2.3.1    Client Side Cache
2.3.2    Proxy Server Cache
2.3.3    Origin Server Cache
2.4       Cache Replacement Policy
2.5       Proxy Caching
2.5.1    Forward Proxy Caching
2.5.2    Reverse Proxy Caching
2.5.3    Transparent Caching
2.6       Web Prefetching
2.6.1    Short Term Prefetching
2.6.2    Long Term Prefetching
2.7       Related Works
2.8       Literature Gap and Contribution of This Work

CHAPTER THREE
HIGH TRAFFIC WEB PREFETCHING MODEL
3.1       Introduction
3.2       The Prefetching Model Architecture
3.3       Preprocessing of Proxy Access Log Files
3.3.1    Data Collection
3.3.2    Data Cleaning
3.3.3    User & Session Identification
3.4       Clustering Of Preprocessed Log Files
3.4.1    Web Navigation Graph (WNG)
3.7       Inter Clustering

CHAPTER FOUR
IMPLEMENTATION AND RESULT
4.1       Introduction
4.2       Implementation Details
4.2.1    Programming Language
4.2.2    Squid Proxy Server
4.2.3    Dataset
4.3       System Specification
4.4       Implementation Result
4.4.1    System Model
4.5       Performance Evaluation Criteria
4.6       Discussion of Results
4.6.1    Accuracy of Prediction
4.6.2    Usefulness of Prediction
4.6.3    Precision
4.6.4    Hit Ratio
4.6.5    Byte Ratio
4.7       Summary of The Result

CHAPTER FIVE
SUMMARY, CONCLUSION AND RECOMMENDATIONS
5.1       Summary
5.2       Conclusion
5.3       Recommendations
REFERENCES

ABSTRACT
The continued increase in demand for objects on the Internet causes high web traffic and, consequently, slow user response time, which is one of the major bottlenecks in the network world. Increasing bandwidth is a possible solution to the problem, but it involves increasing economic cost. An alternative solution is web prefetching. Web prefetching is the process by which a proxy server predicts and fetches web pages in advance, before a user sends a request. Prefetching is performed during the server idle time. Most literature based on the classical prefetch algorithm assumes that the server idle time is large enough to prefetch all of the users' predicted requests, which is not true in a real-life situation. This research aims at improving web prefetching by developing a prefetching technique that remains effective in a high traffic environment, when the server idle time is very low. Log files were collected and preprocessed for several client groups within a domain. The preprocessed log files were used to create a web navigation graph, which shows the transitions from one web page to another. Support and confidence thresholds were used to remove web pages with values below the thresholds. Several clusters were formed within each client group. When the prefetch time is predicted to be too small to prefetch every prediction, all the clusters formed from the various domains are used to create a prioritized cluster based on the requests of several users. The model was evaluated based on hit rate, byte rate, precision, accuracy of prediction and usefulness of prediction. The results show that the proposed WebClustering algorithm performs better than the classical prefetch technique when the server idle time is small, and behaves the same as the classical algorithm once the server idle time becomes large enough to prefetch all users' predictions.
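
The web navigation graph and threshold step described in the abstract can be illustrated with a minimal Python sketch. It is not the dissertation's WebClustering implementation. The definitions used here are assumptions borrowed from association-rule mining: the support of a transition A to B is the number of times it occurs in the sessions, and its confidence is that count divided by the number of visits to A. The function names, sample sessions and threshold values are hypothetical.

```python
from collections import Counter


def build_wng(sessions):
    """Count page visits and page-to-page transitions across sessions.

    Each session is an ordered list of URLs visited by one user.
    """
    page_counts = Counter()
    edge_counts = Counter()
    for session in sessions:
        page_counts.update(session)
        # Consecutive pairs (A, B) are the edges of the navigation graph.
        edge_counts.update(zip(session, session[1:]))
    return page_counts, edge_counts


def prune_wng(page_counts, edge_counts, min_support, min_confidence):
    """Keep only edges that reach both thresholds.

    Assumed definitions (not taken from the source):
      support(A -> B)    = number of observed A -> B transitions
      confidence(A -> B) = support(A -> B) / visits(A)
    """
    kept = {}
    for (src, dst), count in edge_counts.items():
        confidence = count / page_counts[src]
        if count >= min_support and confidence >= min_confidence:
            kept[(src, dst)] = (count, confidence)
    return kept


# Hypothetical sessions standing in for preprocessed proxy log entries.
sessions = [
    ["/home", "/news", "/sport"],
    ["/home", "/news"],
    ["/home", "/about"],
    ["/news", "/sport"],
]

page_counts, edge_counts = build_wng(sessions)
wng = prune_wng(page_counts, edge_counts, min_support=2, min_confidence=0.5)

for (src, dst), (support, confidence) in sorted(wng.items()):
    print(f"{src} -> {dst}: support={support}, confidence={confidence:.2f}")
```

With these sample sessions, the transition from /home to /about occurs only once and is dropped, while the two transitions that occur twice survive the pruning.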

CHAPTER ONE
INTRODUCTION
1.1       Background of Study
The web is a collection of text documents and other resources, linked by hyperlinks and Uniform Resource Locators (URLs), usually accessed through web browsers from web servers. The web started as a simple information-sharing system and has now grown into a rich collection of dynamic and interactive services. The tremendous growth of the web has resulted in high demand for bandwidth and in delays in fetching user requests (Neha, 2013). Users sometimes experience unpredictable delays while retrieving web pages from the server. Increasing bandwidth is a possible solution to the problem, but it involves high economic cost. Web caching reduces the latency perceived by the user, reduces bandwidth utilization and reduces the load on the origin servers (Pallis, 2007). Latency refers to the time elapsed from when a request is sent to when the sender receives the requested information.

Many latency-tolerant techniques have been developed over the years to solve this problem without necessarily increasing the bandwidth. The most notable are caching and prefetching. Web prefetching fetches and caches users' requests during server idle time, which reduces the load on the origin server. To reduce the access delay experienced by users, it is advisable to predict and prefetch web objects based on user access patterns and cache them. Studies on web prefetching are mostly based on the history of user access patterns. If the history shows that URL A is followed by URL B with a high probability, then B will be prefetched once A is accessed (Cheng-Zhong, 2000).....
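
The history-based rule described above (prefetch B once A is accessed, if B followed A with high probability) can be sketched in Python as follows. This is an illustration under stated assumptions, not the method of any cited work: the class name, the probability threshold of 0.6, the sample history and the use of a plain set as the cache are all hypothetical.

```python
from collections import Counter, defaultdict


class NextPagePredictor:
    """First-order predictor: estimates P(next = B | current = A) from history."""

    def __init__(self, threshold):
        self.threshold = threshold            # minimum probability to prefetch
        self.visits = Counter()               # times A was followed by any page
        self.followers = defaultdict(Counter)  # followers[A][B] = count of A -> B

    def train(self, sessions):
        for session in sessions:
            for current, nxt in zip(session, session[1:]):
                self.visits[current] += 1
                self.followers[current][nxt] += 1

    def predict(self, url):
        """Return the pages whose estimated probability reaches the threshold."""
        total = self.visits[url]
        if total == 0:
            return []
        return [
            nxt
            for nxt, count in self.followers[url].most_common()
            if count / total >= self.threshold
        ]


def on_access(url, cache, predictor):
    """Serve a request, then prefetch the predicted successors into the cache."""
    hit = url in cache
    cache.add(url)
    cache.update(predictor.predict(url))
    return hit


# Hypothetical access history: A is followed by B three times out of four.
history = [["A", "B"], ["A", "B"], ["A", "C"], ["A", "B"]]

predictor = NextPagePredictor(threshold=0.6)
predictor.train(history)

cache = set()
first_hit = on_access("A", cache, predictor)   # miss; B is prefetched
second_hit = on_access("B", cache, predictor)  # hit thanks to the prefetch
print(first_hit, second_hit)
```

In this sample history B follows A with an estimated probability of 0.75, so accessing A prefetches B, while C (probability 0.25) stays below the threshold and is not fetched.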

================================================================
Item Type: Project Material  |  Attribute: 60 pages  |  Chapters: 1-5
================================================================
