tmp_question
Joined: 04 Aug 2004 Posts: 5
Posted: Wed Aug 04, 2004 Post subject: How to negate with regex's?
Hi all,
how can I invert a filter, if I want all sites except ones where the URL contains a certain word?
For example, I want to block every image (or some other element) from "xyz.com", except the parts from "abc.xyz.com". The filter "/[^abc]\.xyz\.com/" does not work because "^" can only negate character classes, not whole words like "abc". Something like "/(abc){0}\.xyz\.com/" and similar attempts didn't work either.
I was not able to find a useful tutorial or a hint on how to handle this special kind of filtering. Am I right in saying that this is not possible with the standard regular expressions that Adblock offers (whitelists don't exist yet)? Or is there some way to get what I need?
Thx for your help!
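A small check may make the character-class problem concrete. The class [^abc] stands for one single character that is not "a", "b" or "c"; it does not mean "anything but the word abc". The URLs below are hypothetical and only serve as test input.

```javascript
// The filter from the post: [^abc] negates ONE character, not the word "abc".
const charClassFilter = /[^abc]\.xyz\.com/;

// Hypothetical test URLs; xyz.com and abc.xyz.com are the placeholders from the post.
const urls = [
  "http://abc.xyz.com/logo.png",   // should be allowed
  "http://www.xyz.com/ad.png",     // should be blocked
  "http://media-b.xyz.com/ad.png", // should be blocked
  "http://xyz.com/ad.png",         // should be blocked
];

// true means "the filter matches, so the element would be blocked".
const results = urls.map((url) => charClassFilter.test(url));
console.log(results);
```

Only the second URL matches. The third slips through because the host happens to end in "b", and the fourth because the pattern needs some character plus a literal dot in front of "xyz".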
 |
NJH
Joined: 13 Nov 2003 Posts: 183 Location: Hampshire, England
Posted: Wed Aug 04, 2004
I've seen two ways of negating, but I can only remember one. I have a filter, /akamai\.net(?!.*but)/, which blocks akamai.net in every case except when it is followed somewhere by "but". I use it because on one site the navigation buttons (file names ending in "but") are supplied by akamai.net, but so is a lot of the advertising. The filter is a bit broad in its negating, but it does the job. Perhaps you can do something like it.
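The behaviour of that negative lookahead can be checked with a few made-up URLs. The host and file names below are assumptions for illustration; only the filter itself comes from the post.

```javascript
// NJH's filter, copied from the post above.
const akamaiFilter = /akamai\.net(?!.*but)/;

// Hypothetical URLs for illustration only.
const samples = {
  ad: "http://a1.g.akamai.net/ads/banner468x60.gif",
  navButton: "http://a1.g.akamai.net/nav/homebut.gif",
  unintendedPass: "http://a1.g.akamai.net/ads/buttermilk.gif",
};

// true means "the filter matches, so the element would be blocked".
const blocked = Object.fromEntries(
  Object.entries(samples).map(([name, url]) => [name, akamaiFilter.test(url)])
);
console.log(blocked);
```

The last sample shows the broadness mentioned in the post: any "but" anywhere after akamai.net lets the URL through, whether or not it is a navigation button.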
 |
tmp_question
Joined: 04 Aug 2004 Posts: 5
Posted: Wed Aug 04, 2004
NJH wrote: | one filter /akamai\.net(?!.*but)/ which blocks akamai.net in every case except when it is followed somewhere by but [...] Perhaps you can do something like it. |
Unfortunately this does not work in my case. A "look-ahead" only helps when the exception I want to allow comes after the blocked part (like "but" comes after "akamai" in the above example). Looking through the regex patterns, I find no equivalent for words that precede the blocked "main" URL.
Thx anyway for your help!
 |
NJH
Joined: 13 Nov 2003 Posts: 183 Location: Hampshire, England
Posted: Wed Aug 04, 2004
Have a look at this thread and good luck.
 |
tmp_question
Joined: 04 Aug 2004 Posts: 5
Posted: Thu Aug 05, 2004
NJH wrote: | Have a look at this thread and good luck. |
Thanks. I've read it carefully, but it doesn't help: it covers the basics of regexps but nothing that is useful in my case. The problem is that I don't need a "look-ahead"; what I need is a kind of "look-behind". According to the tutorial pages, JavaScript's regexps are quite crippled and don't support that. As far as I understand, they also don't allow inverting a string or matching expression (like Perl does), which would also help me get the "whitelist" effect I want.
Are there any other suggestions? Can someone give me a regexp for the example above (filtering everything from "xyz.com" except content from "abc.xyz.com")? Am I just blind, or is this really not possible? I'd be rather disappointed by Adblock then...
Thx!
 |
kstahl Support
Joined: 02 Jan 2004 Posts: 1202 Location: Stockholm, Sweden
Posted: Thu Aug 05, 2004
If what NJH told you won't help, I don't think it's possible.
The next version of Adblock will however include whitelist functionality. That should solve your problem.
_________________
Adblock 0.5.3.042
Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1. Gecko/20051111 Firefox/1.5
 |
tmp_question
Joined: 04 Aug 2004 Posts: 5
Posted: Sat Aug 07, 2004
kstahl wrote: | If what NJH told you won't help, I don't think it's possible. |
:-(
However, I'm quite astonished that I'm the only one who has asked for that feature so far and that no other Adblock user has tried the same.
kstahl wrote: | The next version of Adblock will however include whitelist functionality. That should solve your problem. |
Perhaps. It depends on the power of these whitelists and the way they get implemented. They need to be applied in a certain order to work the way I want. In my example I have to (1) blacklist everything from "xyz.com", (2) override that with a whitelist allowing everything from "abc.xyz.com" and (3) somehow (?) block certain content from "abc.xyz.com" (images, ads, banners) again as usual.
If "whitelist" only allows step (2) and just means "allow everything from 'abc.xyz.com'" without letting me fine-tune it and override that whitelist again with some regexps/entries, I have the same problem again...
Are any details known yet about the whitelist functionality in the upcoming Adblock versions? And when will they be available?
Thx!
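The ordering described in steps (1) to (3) can be sketched as a rule list in which the last matching rule decides. This is purely illustrative: the rule format, the decide function and the last-match-wins policy are assumptions, not a description of how Adblock's whitelist is or will be implemented.

```javascript
// Hypothetical rule list, NOT Adblock's real data model.
// Order matters: later rules override earlier ones.
const rules = [
  { action: "block", pattern: /xyz\.com/ },                         // step (1)
  { action: "allow", pattern: /abc\.xyz\.com/ },                    // step (2)
  { action: "block", pattern: /abc\.xyz\.com\/(ads|banners)\// },   // step (3)
];

// Walk all rules; the last rule whose pattern matches sets the verdict.
// URLs that match no rule are allowed.
function decide(url, ruleList) {
  let verdict = "allow";
  for (const rule of ruleList) {
    if (rule.pattern.test(url)) {
      verdict = rule.action;
    }
  }
  return verdict;
}

console.log(decide("http://www.xyz.com/pic.gif", rules));
console.log(decide("http://abc.xyz.com/photo.jpg", rules));
console.log(decide("http://abc.xyz.com/ads/top.gif", rules));
```

A whitelist that simply ended the evaluation at step (2) would never reach the third rule, which is the limitation the post worries about.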
 |
kstahl Support
Joined: 02 Jan 2004 Posts: 1202 Location: Stockholm, Sweden
Posted: Sat Aug 07, 2004
Well, if it's not supported in the RegEx engine, then I don't see how it could be done.
You are definitely not the first person to ask for some kind of whitelist feature, that's for sure. And whitelists of some sort should be included in the next version. Maybe it will be possible to do what you want. The devs are very tight-lipped when it comes to new features. We shall find out in a couple of weeks' time, I guess.
Until then, it sounds to me like you should create more specific filters instead of trying to block entire domains. Which page is it you have problems with?
 |
tmp_question
Joined: 04 Aug 2004 Posts: 5
Posted: Tue Aug 10, 2004
kstahl wrote: | You are definitely not the first person to ask for some kind of whitelist feature, that's for sure. And whitelists of some sort should be included in the next version. Maybe it will be possible to do what you want. The devs are very tight-lipped when it comes to new features. We shall find out in a couple of weeks' time, I guess. |
Okay, I guess I have to wait then...
kstahl wrote: | Until then, it sounds to me like you should create more specific filters instead of trying to block entire domains. Which page is it you have problems with? |
There are several, but a good example is the German news page 'focus.de', which is hosted by MSN. So I want to block everything that comes directly from 'msn.de' or 'msn.com' (images, ads), except the content from '.*\.focus\.msn\.de' (where I want to see everything, including the images and photos belonging to the different articles). But even there I want to block (with a regex) everything that seems to be an ad or banner.
So far I have not been able to create a filter that supports all that and neither leaves too much content nor removes something I definitely want to see. But perhaps you have a better idea?
Thx!
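When the exception sits at a known position, such as inside the host name, a negative lookahead anchored at the start of the URL can sometimes stand in for the missing look-behind. The sketch below assumes the filter is tested against the full URL string including the scheme, which may not hold for every Adblock version; the sample URLs are made up.

```javascript
// Anchored at the start of the URL. Right after the scheme, the lookahead
// rejects any host that contains "focus.msn.de"; only then is the host
// required to contain msn.de or msn.com.
const msnExceptFocus = /^https?:\/\/(?![^\/]*focus\.msn\.de)[^\/]*msn\.(?:de|com)/;

// true means "the filter matches, so the element would be blocked".
const isBlocked = (url) => msnExceptFocus.test(url);

// Hypothetical URLs for illustration only.
console.log(isBlocked("http://www.msn.de/images/teaser.gif"));
console.log(isBlocked("http://ads.msn.com/banner.gif"));
console.log(isBlocked("http://focus.msn.de/article/photo.jpg"));
console.log(isBlocked("http://bilder.focus.msn.de/photo.jpg"));
```

This covers only the first two requirements. Blocking ads served from focus.msn.de itself would still need a separate, ordinary filter, and the lookahead is somewhat broad because it also spares any other host that merely ends in "focus.msn.de".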
 |
*HappyCamper*
Joined: 13 Aug 2004 Posts: 2
Posted: Fri Aug 13, 2004 Post subject: Negation
Bearing in mind my limited knowledge, I believe that negation is possible (wrong! see below). I used something like this:
Code: | /(!?theonion.com{0,0})[rest of filter]/ |
to block images based on xxx by xxx filenames while retaining the images on theonion.com. As far as I can tell, anything captured by {0,0} must occur exactly zero times for the filter to apply. The resident gurus might like to comment.
EDIT: Added "!?". Oops. My testing using a checker showed that
matches "red" but not "blue/red". Hope this is useful. Could it be added to regexp tutorial info?
EDIT: Responses below indicate rightly that this isn't possible. Fooled myself on a late night. Sorry 'bout any confusion.
_________________
Who are you? You are not Orz! We are Orz! Orz are happy *people energy* from the outside.
Last edited by *HappyCamper* on Sat Aug 14, 2004; edited 1 time in total
 |
NJH
Joined: 13 Nov 2003 Posts: 183 Location: Hampshire, England
Posted: Fri Aug 13, 2004
HappyCamper,
I cannot get your filter (without the /'s) to match "red" on your test site. Are you sure it is correct or am I missing something?
Nick
 |
Org
Joined: 23 Oct 2003 Posts: 349
Posted: Fri Aug 13, 2004
Same here, NJH. That syntax didn't work for me on that test site, and it didn't work in Adblock filters. Could it be dependent on browser version? HappyCamper, what do you use?
 |
rue Developer
Joined: 22 Oct 2003 Posts: 752
Posted: Fri Aug 13, 2004
He said "I use something like.." -- which is almost certainly true. However, zero-quantifiers ({0,0}) are not allowed; and the delimiter-syntax for lookahead was reversed.
.
Here's a functioning reconstruct: /(?!blue)red/
.
I take that back. There is no flexible way to craft lookbehind in Mozilla.
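A quick check may help show why the reconstruction does not act as a look-behind. The lookahead in /(?!blue)red/ is evaluated at the position where "red" starts, so it only confirms that the next characters are not "blue"; it never inspects what came before. The trailing pattern is a hypothetical mirror image added for contrast.

```javascript
const leading = /(?!blue)red/;     // the reconstruction from the post above
const trailing = /red(?!\/blue)/;  // hypothetical mirror image, for contrast

const checks = {
  leadingRed: leading.test("red"),
  leadingBlueRed: leading.test("blue/red"),   // still matches: "blue" is behind, not ahead
  trailingRed: trailing.test("red"),
  trailingRedBlue: trailing.test("red/blue"), // rejected: "/blue" is ahead of "red"
};
console.log(checks);
```

Only the last check comes out false, which matches the experience earlier in the thread: an exception that follows the blocked text can be expressed, one that precedes it cannot be expressed this way.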
 |
*HappyCamper*
Joined: 13 Aug 2004 Posts: 2
Posted: Sat Aug 14, 2004 Post subject: Sorry, my mistake
Thanks for clearing that up, rue. Must have fooled myself. Regexps and late nights don't combine well for me. Looking forward to the whitelist feature.