DeriveIt's Web Content Filter: New Approach

New algorithm

Applies unique statistical and probabilistic methods to train filters that profile web pages.

Drawing on our experience with Bayesian networks, classification techniques, probability and statistics, and proprietary heuristics, we have developed a filter engine that can build a category filter from as few as 10 to 50 websites for that category. The resulting filter has an accuracy rate of 95-99%, with a low over-blocking rate.
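
The general idea of training a probabilistic text filter from a small set of labeled pages can be illustrated with a naive Bayes classifier. This is only a minimal sketch of a standard textbook technique, not DeriveIt's proprietary engine; the class name, the "good"/"bad" labels, and the sample training text are all assumptions made for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Split page text into lowercase alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayesFilter:
    """Hypothetical two-class word-count filter (illustration only)."""

    def __init__(self):
        self.word_counts = {"good": Counter(), "bad": Counter()}
        self.doc_counts = {"good": 0, "bad": 0}

    def train(self, text, label):
        """Add one labeled page to the training data."""
        self.word_counts[label].update(tokenize(text))
        self.doc_counts[label] += 1

    def score(self, text, label):
        """Log-probability of the label given the text, up to a constant.

        Requires at least one training page for the label.
        """
        vocab = set(self.word_counts["good"]) | set(self.word_counts["bad"])
        total = sum(self.word_counts[label].values())
        log_prob = math.log(self.doc_counts[label] / sum(self.doc_counts.values()))
        for word in tokenize(text):
            # Add-one (Laplace) smoothing so unseen words do not zero the score.
            count = self.word_counts[label][word]
            log_prob += math.log((count + 1) / (total + len(vocab)))
        return log_prob

    def classify(self, text):
        """Return the label with the higher score."""
        return max(("good", "bad"), key=lambda label: self.score(text, label))


# Made-up training text standing in for the content of a few labeled websites.
demo = NaiveBayesFilter()
demo.train("poker casino jackpot bets", "bad")
demo.train("casino slots roulette bets", "bad")
demo.train("school homework math lessons", "good")
demo.train("weather forecast news lessons", "good")
print(demo.classify("online casino jackpot"))  # prints "bad"
```

A real category filter would need far richer features and training data than this toy example; the sketch only shows why a handful of labeled sites can already separate pages with distinctive vocabulary.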

Dynamic web content filtering

Allows for adaptable, dynamic content filtering.

Google states that it has searched over 4,000,000,000 web pages, and this number is growing daily. Most web filters currently available rely on blacklists which, based on product claims, contain between 5 and 20 million websites. This leaves a large gap between the total number of websites and the largest blacklists in use.
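
The size of that gap can be estimated with simple arithmetic. The sketch below uses only the figures quoted above and assumes, as a simplification, that one blacklist entry corresponds to one indexed page (a single website may in fact contain many pages); the function and constant names are illustrative.

```python
INDEXED_PAGES = 4_000_000_000               # pages Google states it has searched
BLACKLIST_SIZES = (5_000_000, 20_000_000)   # range taken from product claims


def coverage(blacklist_size, total=INDEXED_PAGES):
    """Fraction of the indexed web that a blacklist of this size could cover."""
    return blacklist_size / total


for size in BLACKLIST_SIZES:
    print(f"{size:,} entries cover {coverage(size):.3%} of indexed pages")
```

Under this assumption, even the largest blacklist quoted would cover about 0.5% of the indexed pages.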

No White/Blacklist maintenance

There are several problems with blacklists:

  • How can an accurate blacklist be maintained with the large number of websites available, and with that number growing?
  • How can a blacklist be accurate when the content of websites can change at any time?
  • How can the blacklist method be efficient if you have to look up a website in a database with billions of URLs?
  • How can a blacklist be effective when a bypass proxy server is used?
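
The proxy problem in the last point can be shown with a small sketch. It assumes a blacklist keyed on hostnames and a web proxy that carries the target address in a query parameter; the hostnames `blocked.example` and `proxy.example` and the function name are hypothetical.

```python
from urllib.parse import urlsplit

# Hypothetical blacklist containing a single hostname.
BLACKLIST = {"blocked.example"}


def blacklist_blocks(url):
    """Return True if the URL's hostname is on the blacklist."""
    return urlsplit(url).hostname in BLACKLIST


direct = "http://blocked.example/page"
# Same page requested through a hypothetical web proxy.
via_proxy = "http://proxy.example/fetch?url=http%3A%2F%2Fblocked.example%2Fpage"

print(blacklist_blocks(direct))     # True: the hostname is listed
print(blacklist_blocks(via_proxy))  # False: only the proxy's hostname is checked
```

A filter that examines the content of the returned page does not depend on which hostname delivered it, which is the motivation for the content-based approach described here.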

System requirements

Microsoft's Internet Explorer (IE) 4.0 and above
Windows 95, 98, NT, ME, 2000, XP, 2003
Disk Space Usage - 3 MB

High accuracy rate

Overall accuracy rate: 99%

Category         Training                   Tested           Accuracy
Pornography      2714 good, 2399 bad URLs   2668 bad URLs    blocking rate: 99%
Adult content    485 good, 52 bad URLs      50 bad URLs      blocking rate: 98%
Gambling         485 good, 49 bad URLs      50 bad URLs      blocking rate: 99%
Weapons          485 good, 97 bad URLs      50 bad URLs      blocking rate: 97%
"Good" content   -                          2472 good URLs   over-blocking rate: 1%

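The two measures reported above can be read as simple ratios. The sketch below states the definitions we assume: blocking rate as the share of tested bad URLs that were blocked, and over-blocking rate as the share of tested good URLs that were blocked by mistake. The raw counts of blocked URLs are not given here, so the counts in the example (49 and 25) are assumptions inferred from the reported percentages, not measured figures.

```python
def blocking_rate(bad_blocked, bad_tested):
    """Share of tested bad URLs that the filter blocked."""
    return bad_blocked / bad_tested


def over_blocking_rate(good_blocked, good_tested):
    """Share of tested good URLs that the filter blocked by mistake."""
    return good_blocked / good_tested


# Assumed counts: 49 of 50 bad URLs blocked would match a 98% blocking rate.
print(f"{blocking_rate(49, 50):.0%}")
# Assumed counts: 25 of 2472 good URLs blocked would match roughly 1%.
print(f"{over_blocking_rate(25, 2472):.0%}")
```
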
Features

  • Parental Override - Password control of filtering
  • Enable/Disable Content Filtering
  • Easy Installation - IE Plug-in
  • Compatible with firewalls, proxy servers, caching and other network devices

Product Usage