The Adblock Project Forum

How to negate with regex's?

 
tmp_question
Joined: 04 Aug 2004
Posts: 5

Posted: Wed Aug 04, 2004    Post subject: How to negate with regex's?

Hi all,

How can I invert a filter so that it matches every URL except those containing a certain word?
For example, I want to block every image (or other element) from "xyz.com", except those served from "abc.xyz.com". The filter "/[^abc]\.xyz\.com/" does not work because "^" negates only a character class (a single character), not a whole word like "abc". Something like "/(abc){0}\.xyz\.com/" didn't work either.
I could not find a tutorial or hint that covers this kind of filter. Am I right that this is not possible with the regular expressions Adblock offers (whitelists don't exist yet)? Or is there some way to get what I need?
Thx for your help!
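To illustrate why both attempts in the question fail, here is a minimal JavaScript sketch. The URLs are invented test values, not real addresses, and the sketch only exercises the regex engine, not Adblock itself.

```javascript
// [^abc] matches exactly ONE character that is not "a", "b" or "c".
const charClass = /[^abc]\.xyz\.com/;

// Hypothetical test URLs (assumptions for illustration).
const wanted = "http://abc.xyz.com/photo.jpg"; // should be allowed
const ads = "http://ads.xyz.com/banner.gif";   // should be blocked
const cab = "http://cab.xyz.com/banner.gif";   // should be blocked too

console.log(charClass.test(wanted)); // false: "c" precedes ".xyz.com"
console.log(charClass.test(ads));    // true:  "s" precedes ".xyz.com"
console.log(charClass.test(cab));    // false: "b" precedes ".xyz.com", so it slips through

// (abc){0} means "abc repeated zero times", i.e. the empty string,
// so this pattern behaves like /\.xyz\.com/ and exempts nothing.
const zeroTimes = /(abc){0}\.xyz\.com/;
console.log(zeroTimes.test(wanted)); // true
```

So the character class exempts any host whose last letter before ".xyz.com" is a, b or c, and the zero-repetition group exempts nothing at all.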
NJH
Joined: 13 Nov 2003
Posts: 183
Location: Hampshire, England

Posted: Wed Aug 04, 2004

I've seen two ways of negating, but I can only remember one of them. I have a filter, /akamai\.net(?!.*but)/, which blocks akamai.net in every case except when it is followed somewhere by "but". I need this because on one site I use, the navigation buttons (with names ending in "but") are supplied by akamai.net, and so is a lot of the advertising. The filter is a bit broad in what it exempts, but it does the job. Perhaps you can do something similar.
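A quick way to see what this filter does is to run it against a few addresses. The sketch below uses made-up URLs (the host prefix and file names are assumptions) and plain JavaScript rather than Adblock itself.

```javascript
// NJH's filter: "akamai.net", not followed anywhere later by "but".
const filter = /akamai\.net(?!.*but)/;

// Hypothetical URLs for illustration only.
const banner = "http://a1.akamai.net/ads/banner.gif";
const button = "http://a1.akamai.net/nav/homebut.gif";
const butter = "http://a1.akamai.net/ads/butter.gif";

console.log(filter.test(banner)); // true:  would be blocked
console.log(filter.test(button)); // false: spared, "but" follows the host
console.log(filter.test(butter)); // false: also spared, which is the "bit broad" part
```

The lookahead (?!...) consumes no characters; it only checks what comes after the point where "akamai.net" ended.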
tmp_question
Joined: 04 Aug 2004
Posts: 5

Posted: Wed Aug 04, 2004

NJH wrote:
one filter /akamai\.net(?!.*but)/ which blocks akamai.net in every case except when it is followed somewhere by but [...] Perhaps you can do something like it.


Unfortunately this does not work in my case. A "look-ahead" only helps when the exception I want to allow comes after the matched text (as "but" comes after "akamai" in the example above). In the regex documentation I can find no equivalent for words that precede the blocked "main" URL.
Thx anyway for your help!
NJH
Joined: 13 Nov 2003
Posts: 183
Location: Hampshire, England

Posted: Wed Aug 04, 2004

Have a look at this thread and good luck.
tmp_question
Joined: 04 Aug 2004
Posts: 5

Posted: Thu Aug 05, 2004

NJH wrote:
Have a look at this thread and good luck.


Thanks. I've read it carefully, but it doesn't help: it covers the basics of regexps but nothing that is useful in my case. The problem is that I don't need a "look-ahead"; what I need is a kind of "look-behind". According to the tutorial pages, JavaScript's regexps are quite crippled and don't support that. As far as I understand, they also don't allow inverting a string or matching expression (as Perl does), which would also give me the "whitelist" effect I want.
Are there any other suggestions? Can someone give me a regexp for the example above (filtering everything from "xyz.com" except content from "abc.xyz.com")? Am I just blind, or is this really not possible? I'd be rather disappointed by Adblock then...
Thx!
kstahl
Support
Joined: 02 Jan 2004
Posts: 1202
Location: Stockholm, Sweden

Posted: Thu Aug 05, 2004

If what NJH told you won't help, I don't think it's possible.

The next version of Adblock will however include whitelist functionality. That should solve your problem.
_________________
Adblock 0.5.3.042
Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8) Gecko/20051111 Firefox/1.5
tmp_question
Joined: 04 Aug 2004
Posts: 5

Posted: Sat Aug 07, 2004

kstahl wrote:
If what NJH told you won't help, I don't think it's possible.


:-(
However, I'm quite astonished that I seem to be the only one who has asked for this feature so far, and that no other Adblock user has tried the same thing.

kstahl wrote:
The next version of Adblock will however include whitelist functionality. That should solve your problem.


Perhaps. It depends on the power of these whitelists and on how they are implemented. They need to be applied in a certain order to work the way I want. In my example I have to (1) blacklist everything from "xyz.com", (2) override that with a whitelist allowing everything from "abc.xyz.com", and (3) block certain content from "abc.xyz.com" again (images, ads, banners) as usual.
If "whitelist" only covers step (2) and just means "allow everything from 'abc.xyz.com'", without letting me fine-tune it and override the whitelist again with some regexps/entries, I have the same problem again...
Are any details known yet about the whitelist functionality in the upcoming Adblock versions? And when will they be available?
Thx!
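The three-step ordering described in this post can be sketched as a small precedence check. This is purely hypothetical: the thread gives no details of how Adblock's whitelist will work, so the list names, the patterns and the function isBlocked are all assumptions made for illustration.

```javascript
// Hypothetical rule lists; names and patterns are illustrative only.
const blacklist = [/xyz\.com/];                    // step (1): everything from xyz.com
const whitelist = [/^https?:\/\/abc\.xyz\.com\//]; // step (2): except abc.xyz.com
const reblock = [/\/(ads|banners)\//];             // step (3): but still block ad-like paths

function isBlocked(url) {
  const matches = (rules) => rules.some((re) => re.test(url));
  if (matches(reblock)) return true;    // step (3) overrides the whitelist
  if (matches(whitelist)) return false; // step (2) overrides the blacklist
  return matches(blacklist);            // step (1)
}

console.log(isBlocked("http://abc.xyz.com/articles/photo.jpg")); // false
console.log(isBlocked("http://abc.xyz.com/ads/banner.gif"));     // true
console.log(isBlocked("http://img.xyz.com/logo.gif"));           // true
console.log(isBlocked("http://example.org/index.html"));         // false
```

The point of the sketch is only that the order of evaluation matters: if the whitelist were checked first and were final, step (3) could never take effect.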
kstahl
Support
Joined: 02 Jan 2004
Posts: 1202
Location: Stockholm, Sweden

Posted: Sat Aug 07, 2004

Well, if it's not supported in the RegEx engine, then I don't see how it could be done.

You are definitely not the first person to ask for some kind of whitelist feature, that's for sure. And whitelists of some sort should be included in the next version. Maybe it will be possible to do what you want. The devs are very tight-lipped when it comes to new features. We shall find out in a couple of weeks' time, I guess.

Until then, it sounds to me like you should create more specific filters instead of trying to block entire domains. Which page is it you have problems with?
tmp_question
Joined: 04 Aug 2004
Posts: 5

Posted: Tue Aug 10, 2004

kstahl wrote:
You are definitely not the first person to ask for some kind of whitelist feature, that's for sure. And whitelists of some sort should be included in the next version. Maybe it will be possible to do what you want. The devs are very tight-lipped when it comes to new features. We shall find out in a couple of weeks' time, I guess.


Okay, I guess I have to wait then...

kstahl wrote:
Until then, it sounds to me like you should create more specific filters instead of trying to block entire domains. Which page is it you have problems with?


There are several, but a good example is the German news site 'focus.de', which is hosted by MSN. I want to block everything that comes directly from 'msn.de' or 'msn.com' (images, ads), except the content from '.*\.focus\.msn\.de' (where I want to see everything, including the images and photos belonging to the articles). But even there I want to block (with a regex) everything that looks like an ad or banner.
So far I have not been able to create a filter that does all that, one that neither lets too much through nor removes something I definitely want to see. But perhaps you have a better idea?
Thx!
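One possible approach to this case, not suggested anywhere in the thread, is to start the match at the beginning of the URL and put a lookahead in front of the host name, so that no look-behind is needed. The sketch below is a minimal illustration under several assumptions: that a filter is tested against the full URL, that the URL starts with http:// or https://, and that the paths and the adLike pattern (all invented here) resemble the real ones.

```javascript
// Block hosts ending in msn.de or msn.com, unless the host ends in focus.msn.de.
// The lookahead sits right after "://", i.e. BEFORE the host name is consumed.
const blockMsn = /^https?:\/\/(?![^\/]*focus\.msn\.de\/)[^\/]*msn\.(de|com)\//;

// A separate, ordinary filter for ad-like paths (hypothetical pattern).
const adLike = /\/(ads?|banners?)\//;

// An element is blocked if ANY filter matches it.
const wouldBlock = (url) => blockMsn.test(url) || adLike.test(url);

console.log(wouldBlock("http://focus.msn.de/politik/artikel.html")); // false
console.log(wouldBlock("http://www.focus.msn.de/img/foto.jpg"));     // false
console.log(wouldBlock("http://focus.msn.de/ads/banner.gif"));       // true
console.log(wouldBlock("http://ads.msn.de/banner.gif"));             // true
console.log(wouldBlock("http://www.msn.com/logo.gif"));              // true
```

Because the two filters are combined with "or", the ad filter still applies on the exempted host, which covers the "block again as usual" requirement without a whitelist.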
*HappyCamper*
Joined: 13 Aug 2004
Posts: 2

Posted: Fri Aug 13, 2004    Post subject: Negation

Bearing in mind my limited knowledge, I believe that negation is possible (wrong! see below). I used something like this:

Code:
/(!?theonion.com{0,0})[rest of filter]/


to block images based on "xxx by xxx" filenames while retaining the images on theonion.com. As far as I can tell, anything quantified by {0,0} must occur exactly zero times for the filter to apply. The resident gurus might like to comment.

EDIT: Added "!?". Oops. My testing using a checker showed that
Code:
/(!?blue{0,0})red/

matches "red" but not "blue/red". Hope this is useful. Could it be added to the regexp tutorial info?

EDIT: The responses below rightly indicate that this isn't possible. Fooled myself on a late night. Sorry 'bout any confusion.
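For reference, the second pattern in this post can be taken apart token by token. This is a minimal sketch assuming a present-day JavaScript engine, which accepts {0,0}; the Mozilla build discussed in the thread may have behaved differently.

```javascript
// How /(!?blue{0,0})red/ is read by the engine:
//   !?      an optional literal "!" (not a lookahead)
//   blu     the literal text "blu"
//   e{0,0}  the letter "e" repeated exactly zero times, i.e. nothing
//   red     the literal text "red"
const attempt = /(!?blue{0,0})red/;

console.log(attempt.test("red"));      // false: "blu" must come first
console.log(attempt.test("blue/red")); // false: "blu" is not directly followed by "red"
console.log(attempt.test("blured"));   // true
console.log(attempt.test("!blured"));  // true
```

In other words, the quantifier binds only to the final "e", and "!?" is an ordinary optional character, so the pattern negates nothing.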
_________________
Who are you? You are not Orz! We are Orz! Orz are happy *people energy* from the outside.


Last edited by *HappyCamper* on Sat Aug 14, 2004; edited 1 time in total
NJH
Joined: 13 Nov 2003
Posts: 183
Location: Hampshire, England

Posted: Fri Aug 13, 2004

HappyCamper,

I cannot get your filter (without the /'s) to match "red" on your test site. Are you sure it is correct or am I missing something?

Nick
Org
Joined: 23 Oct 2003
Posts: 349

Posted: Fri Aug 13, 2004

Same here, NJH. That syntax didn't work for me on that test site, and it didn't work in Adblock filters. Could it be dependent on browser version? HappyCamper, what do you use?
rue
Developer
Joined: 22 Oct 2003
Posts: 752

Posted: Fri Aug 13, 2004

He said "I use something like..", which is almost certainly true. However, zero-quantifiers ({0,0}) are not allowed, and the delimiter syntax for the lookahead was reversed.

Here's a functioning reconstruction: /(?!blue)red/

I take that back. There is no flexible way to craft a lookbehind in Mozilla.
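A short sketch of why the reconstruction had to be taken back, using the same test strings as the earlier post. The anchored variant at the end is not from the thread; it is a common workaround shown here as an assumption about what might be acceptable, with the same broadness NJH described for his own filter.

```javascript
// The lookahead is evaluated at the position where "red" starts,
// so it can never see a "blue" that comes BEFORE that position.
const reconstructed = /(?!blue)red/;
console.log(reconstructed.test("red"));      // true
console.log(reconstructed.test("blue/red")); // true: the negation has no effect

// Workaround sketch: anchor at the start of the string so the lookahead
// can scan the whole text for "blue" before anything is matched.
const anchored = /^(?!.*blue).*red/;
console.log(anchored.test("red"));      // true
console.log(anchored.test("blue/red")); // false
console.log(anchored.test("red/blue")); // false: "blue" anywhere exempts the string
```

The anchored form only expresses "contains red and does not contain blue anywhere", which is cruder than a true look-behind.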
*HappyCamper*
Joined: 13 Aug 2004
Posts: 2

Posted: Sat Aug 14, 2004    Post subject: Sorry, my mistake

Thanks for clearing that up, rue. Must have fooled myself. Regexps and late nights don't combine well for me. Looking forward to the whitelist feature.