
Web Content Crawler

Challenge

To take full advantage of its current business system and to give users quick and easy access to all relevant information, the client decided to develop and launch "Web Content Crawler".

"Web Content Crawler" has a wide range of goals and objectives; the primary ones are listed below:

  • Devising an information retrieval system that extracts as much useful information as possible in the minimum amount of time.
  • Minimizing the effort a user spends locating the information they need.
  • Providing quick and easy access to high-quality documents so that relevant information reaches the user within an acceptable time.
  • Displaying information in order of its relevance to the user's query.
  • Updating the information repository as frequently as possible, in line with how often the web content actually changes.
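
The goal of displaying results in order of relevance can be illustrated with a minimal sketch. It is written in Python for brevity and is not the project's code: the function names (`tokenize`, `rank_by_relevance`) and the scoring rule (a plain count of query-term occurrences) are assumptions chosen for illustration, and the case study does not say which ranking method the solution used.

```python
import re
from collections import Counter


def tokenize(text):
    """Split text into lowercase alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def rank_by_relevance(query, documents):
    """Return (doc_id, score) pairs, highest score first.

    documents maps a document id to its text. The score is the number of
    times any query term occurs in the document (an assumed, simple rule).
    Documents that match no query term are left out.
    """
    query_terms = set(tokenize(query))
    scored = []
    for doc_id, text in documents.items():
        counts = Counter(tokenize(text))
        score = sum(counts[term] for term in query_terms)
        if score > 0:
            scored.append((doc_id, score))
    # Sort by descending score, then by id so ties are ordered predictably.
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored
```

A production system would normally weight terms (for example by rarity) instead of counting raw occurrences; the sketch only shows the ordering step.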

Expertise

  • Technology

    Microsoft Office, C#

  • Database

    Microsoft SQL Server

  • Web Server

    Microsoft Office SharePoint Server

Solution

TatvaSoft provided a solution developed in SharePoint, intended to give quick access to information through the features listed below:

  • Web Crawling: Crawls the specified URL and retrieves the current results from its web pages.
  • Execute Pure RSS Search: Fetches and displays search results from an RSS search. The user can select a feed and search it to retrieve the needed information, and can reload the feed if required.
  • Execute Pure Facebook and Twitter Search: Crawls data from Facebook and Twitter using specific URLs.
  • Configuration of Application: Allows the user to create a new Facebook app and to manage it by updating the application-specific information.
  • Custom Views of Content Pool: Provides options to view all items, published items only, or unpublished items only.
  • View Favorite Posts: Shows a list of all the favorite posts of the user logged into the system.
  • Managing Search and Feeds:
    • Pure RSS Feed: Allows the user to configure a pure RSS feed, including its title, feed URL, and category, and whether the feed is enabled for publishing.
    • Pure Facebook and Twitter: Allows the user to manage pure Facebook and Twitter items, including title, category, username, crawl type, and whether the feed is published.
  • Country Exchange Rate: Manages the exchange rates of different countries.
    • Base Currency: Allows the user to configure the default base currency. This value is saved to the online interface developed in SharePoint when the user saves crawled currency values.
    • Manage Currency Codes: Allows the user to add a new country for exchange rates by providing the country name and currency code.
    • Manage Exchange Rate: Lists the exchange rates set for different countries and lets the user set and manage them.
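
As an illustration of how a base currency and per-country rates can fit together, here is a minimal Python sketch. It is not the SharePoint implementation: the class name `ExchangeRateTable`, its methods, and the convention that each rate is stored as "units of the currency per one unit of the base currency" are all assumptions made for the example.

```python
from decimal import Decimal


class ExchangeRateTable:
    """Exchange rates stored relative to one configurable base currency."""

    def __init__(self, base_currency):
        self.base_currency = base_currency
        # The base currency is always worth exactly one unit of itself.
        self.rates = {base_currency: Decimal("1")}

    def set_rate(self, currency_code, units_per_base):
        """Record how many units of currency_code equal one base unit."""
        self.rates[currency_code] = Decimal(str(units_per_base))

    def convert(self, amount, from_code, to_code):
        """Convert amount between two currencies by way of the base."""
        amount_in_base = Decimal(str(amount)) / self.rates[from_code]
        return amount_in_base * self.rates[to_code]
```

Storing every rate against a single base keeps the table small: adding a new country needs one rate, and any pair can still be converted by passing through the base currency. `Decimal` is used instead of floating point to avoid rounding surprises with money values.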

Result

The client has derived many business benefits from its association with TatvaSoft. These include configuring searches to target precisely what is needed by ignoring certain types of references or sites; continuously monitoring the web for the specified search areas and returning results as they are uncovered; providing access to information in localities with limited or restricted web access; reducing network traffic; and monitoring changes in the relevant areas of the web.
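
Ignoring certain types of references or sites can be sketched as a URL filter applied before a page is crawled. The Python example below is illustrative only: the function name `is_allowed` and the two kinds of rules (blocked domains and blocked file extensions) are assumptions, since the case study does not describe how exclusions were configured.

```python
from urllib.parse import urlparse


def is_allowed(url, blocked_domains, blocked_extensions):
    """Return True if the crawler should visit url.

    blocked_domains: lowercase domain names; subdomains are blocked too.
    blocked_extensions: lowercase path suffixes such as ".pdf".
    """
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    for domain in blocked_domains:
        # Match the domain itself or any of its subdomains.
        if host == domain or host.endswith("." + domain):
            return False
    path = parsed.path.lower()
    return not any(path.endswith(ext) for ext in blocked_extensions)
```

Skipping excluded URLs before fetching them is also one simple way to reduce network traffic, since unwanted pages are never downloaded.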
