International Journal of Information and Communication Technology Research
مجله بین المللی ارتباطات و فناوری اطلاعات
International Journal of Information and Communication Technology Research
Engineering & Technology
http://ijict.itrc.ac.ir
1
admin
2251-6107
2783-4425
doi
1652
25391
en
jalali
1391
9
1
gregorian
2012
12
1
4
4
online
1
fulltext
fa
IECA: Intelligent Effective Crawling Algorithm for Web Pages
فناوری اطلاعات
Information Technology
پژوهشي
Research
<p>Obtaining important pages rapidly can be very useful when a crawler cannot visit the entire Web in a reasonable amount of time. Several crawling algorithms, such as Partial PageRank, Batch PageRank, OPIC, and FICA, have been proposed, but they have high time complexity or low throughput. To overcome these problems, we propose a new crawling algorithm called IECA, which is easy to implement and has low time complexity, O(E*logV), and low memory complexity, O(V), where V and E are the number of nodes and edges in the Web graph, respectively. Unlike the mentioned algorithms, IECA traverses the Web graph only once, and the importance of each Web page is determined from the logarithmic distance and weight of its incoming links. To evaluate IECA, we use three different Web graphs: UK-2005; the Web graph of the University of California, Berkeley-2008; and Iran-2010. Experimental results show that our algorithm outperforms the other crawling algorithms in discovering highly important pages.</p>
search engines, Web crawling, Web graph, logarithmic distance, reinforcement learning, World Wide Web
33
42
http://ijict.itrc.ac.ir/browse.php?a_code=A-10-27-144&slc_lang=fa&sid=1
Mohammad Amin
Golshani
1003194753284600483
1003194753284600483
Yes
Ali Mohammad
Zareh Bidoki
1003194753284600484
1003194753284600484
No