The Adblock Project Forum Index The Adblock Project
Pull up a seat ...stay a while.
 
 FAQFAQ   SearchSearch   MemberlistMemberlist   UsergroupsUsergroups   RegisterRegister 
 ProfileProfile   Log in to check your private messagesLog in to check your private messages   Log inLog in 

Performance issues?

 
Post new topic   Reply to topic    The Adblock Project Forum Index -> Main
View previous topic :: View next topic  
Author Message
johann_p



Joined: 14 Aug 2004
Posts: 4

PostPosted: Sat Aug 14, 2004    Post subject: Performance issues? Reply with quote

Is there data or experience available how the number of patterns affects the performance? I would assume that dozens or even hundreds of regular expressions will have quite an impact on loading a page with many images or iframes?

From what I saw from a quick glance at the sourcecode, the extensions simply loops through all patterns for each url - is this correct?

The most efficient method for doing this would probably be to compile all the patterns into a finite matching automata at program startup or when the pattern list is changed and use that automata with with each URL - especiall if these functions were implemted in C++ and accessed from javascript via XPCOM.

Since generic effecient matching against a long list of patterns is something that could be useful for the base app and for many extensions, this could be a worthwile addition.

Just thinking aloud here - adblock is a fine extensions as it is and I am glad it exists Smile
Back to top
View user's profile Send private message
johann_p



Joined: 14 Aug 2004
Posts: 4

PostPosted: Mon Aug 16, 2004    Post subject: Reply with quote

Do developers read this forum or is it the wrong place to ask such things?

Does anyoneelse have experiences about what the perfomance impact of a using a list with say, 100 patterns is?
Back to top
View user's profile Send private message
wonkothesane
The Other Developer


Joined: 22 May 2004
Posts: 210

PostPosted: Mon Aug 16, 2004    Post subject: Reply with quote

Essentially, what we have now is Fast Enough. It could certainly be made faster, but one of the key ideas with having regular expressions is that you don't need 100 different filters to be able to accurately target undesired content. Performance is also going to be variable across the variety of machines on which Adblock is being run.

Regarding finite matching automata (aka finite automata, aka finite state machine): it's been done for us -- the core engine of most, if not all, regular expression implementations is essentially a form of FSM. Languages like Python require you to explicitly compile your regular expressions from the text pattern into a usable regular expression; JavaScript handles compilation for you on object creation.

A framework for generic matching is a good idea in theory, but The Right Way to do it would be a compiled XPCOM module for small size & speed, but that has its own set of issues -- it's not terribly easy to get started compiling XPCOM components (plus my C++ is rather rusty), and I don't know what would happen if two different extensions registered the same component... and to top it off, I can't really think of any extensions that would actually make much use of such a generic framework.

As Knuth said, "premature optimization is the root of all evil in programming".
Reinventing the wheel comes next.
Back to top
View user's profile Send private message Send e-mail Visit poster's website AIM Address
johann_p



Joined: 14 Aug 2004
Posts: 4

PostPosted: Tue Aug 17, 2004    Post subject: Reply with quote

Thank you for the answer. I was just curious about how lists of 100s of regular expressions would impact performance on pages with many links, since each of the expression needs to get compiled and matched against each link.

I know that regexps are handled internally by getting compiled into FAs but the idea was to compile the whole list of all regexps into a single matcher once at startup/list change time. I do not know if it is possible to explicitly precompile regexps in javascript but if yes, this would be equivalent to converting the list into a large or-connected expression and compiling it and subsequently only use that for each match.

Again, this is not to be taken as critizism, just interested in your opinion Smile
Back to top
View user's profile Send private message
rue
Developer


Joined: 22 Oct 2003
Posts: 752

PostPosted: Tue Aug 17, 2004    Post subject: Reply with quote

johann_p:
wonko's reply strayed slightly offtopic.
.
Yes, that's a potential optimization, and we had considered it. Naturally, we'll need to run tests to plot performance-gain against the trouble to maintain.
Back to top
View user's profile Send private message
Display posts from previous:   
Post new topic   Reply to topic    The Adblock Project Forum Index -> Main All times are GMT + 1 Hour
Page 1 of 1

 
Jump to:  
You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum
You cannot vote in polls in this forum


Powered by phpBB © 2001, 2005 phpBB Group