Nesting regsubex calls can be very useful for modifying parts of a subtext within certain boundaries of the original text. For example, imagine you want to change all a's and b's between X and Y pairs in a string into uppercase, while there can be multiple X-Y pairs, multiple a's and b's (in any order) and any other letters within the string. The approach would be to use two nested regsubex calls, having the outer one extract the text for replacement, and having the inner one perform the actual replacement. Doing all this on one line raises the issue: how to use fields like \t and \n in the <sub> field of the inner regsubex? Just specifying \t or \n doesn't work on the same line, and without those it's impossible to find out what's being replaced.
As it turns out, the problem is that mIRC does a search-and-replace on \t, \n (as well as \1, \2, \a etc.) before evaluation, and therefore can't tell which \t and \n belong to an inner regsubex call. As a result, the \t and \n meant for the inner regsubex are replaced with the contents of the \t and \n of the outer regsubex, which obviously results in wrong output. The solution is to use the construction [[ \ $+ t ]] (and similar) in the <sub> field of the inner regsubex:
$regsubex(acaaXbaacababacaYacbbbbcccbaabbXbcccbaaacbbbcYabcac,/(X[^Y]*Y)/g,$regsubex(x,\t,/([ab])/g,$upper( [[ \ $+ t ]] )))
This, and only this, results in the correct output:
acaaXBAAcABABAcAYacbbbbcccbaabbXBcccBAAAcBBBcYabcac
Putting the whole inner regsubex call into a custom identifier and calling $that_identifier(\t) from the outer regsubex's <sub> field works just as well. Remember to give the regsubex calls unique names (I used 'x' for the inner one and an empty name for the outer one here) or things will go horribly wrong :)
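As a minimal sketch of the custom identifier approach just described: the alias names upper_ab and upper_ab_demo are hypothetical, chosen only for this illustration. Because the inner regsubex lives in its own alias, the outer search-and-replace never sees its \t, so no brackets should be needed. The inner call keeps the name 'x' and the outer one stays unnamed, as in the one-line version.

```
; Hypothetical helper: uppercases every a and b in $1.
alias upper_ab {
  return $regsubex(x,$1,/([ab])/g,$upper(\t))
}

; Hypothetical demo: the outer regsubex hands each X...Y match to the helper.
alias upper_ab_demo {
  echo -a $regsubex(acaaXbaacababacaYacbbbbcccbaabbXbcccbaaacbbbcYabcac,/(X[^Y]*Y)/g,$upper_ab(\t))
}
```

With both aliases loaded in a remote script, typing /upper_ab_demo should echo the same result as the one-line version above.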
Tested on mIRC 6.21.
— Saturn 2007/04/18 17:20
If you're wondering why only the above construction works, it may help to re-read jaytea's evaluation brackets article, specifically the part about brackets inside identifiers. As mentioned there, [[ and ]] are processed at the same stage as single [ ], i.e. before any actual identifier/variable evaluation. To understand what comes next, let's first have a brief look at how mIRC evaluates an identifier:
In $regsubex(), the situation is slightly different; without knowing what's really going on behind the scenes, the process to us looks like this:
Now let's analyze Saturn's example. mIRC first turns [[ ]] into single [ ] and then starts evaluating the outer $regsubex's parameters. When it gets to the third parameter (i.e. <sub>), it first performs the aforementioned search-and-replace on \t, \n, \1, \2 etc. This replaces \t in the <text> parameter of the inner $regsubex with its value (or rather its internal representation). It then attempts evaluation of the inner $regsubex, i.e. spawns a new instance of its script evaluation procedure. But thanks to the previous step, that procedure is now called to evaluate
$regsubex(x,<value of outer \t>,/([ab])/g,$upper( [ \ $+ t ] ))
The usual rules apply here; [ ] are processed first and any code inside them is pre-evaluated, so \ $+ t becomes \t. After the [ ] pre-processing, mIRC proceeds with the search-and-replace of \t, \1 etc., so the \t that was just constructed is given its value; this value comes from the regex matches of the inner $regsubex, since we are now in its context.
One can thus see why only the double-brackets construction works; by controlling the evaluation order, we can have the second \t constructed after the search-and-replace of the outer $regsubex but before the search-and-replace of the inner $regsubex.
You may have noticed above that \t in the <text> parameter of the inner $regsubex works fine (evaluating to the match from the outer $regsubex). Can you use \t & co in the <sub> parameter of the inner $regsubex (to refer to the outer \t again)? The answer is no, and here's why: as we saw above, the <sub> parameter of $regsubex is evaluated after the regex match has been performed and the appropriate internal structures have been set. mIRC's search-and-replace replaces \t in the inner <sub> with its internal representation, hereafter called <t>. The problem is that <t> seems to be 'reset' by the inner $regsubex; in the context of the latter, <t> does not represent anything. The workaround is again [[ ]]; enclosing \t in them pre-evaluates its internal representation, so the inner $regsubex only sees the actual value of \t. A real-world example (I needed such functionality at one point) is the following:
$regsubex(5-7 10 14-18 20 23-29,/(\d+)-(\d+)/g,$regsubex(x,$str(.,$calc(\2 - \1 + 1)),/./g,$calc( [[ \1 + \ $+ n ]] - 1) $chr(32)))
What this does is expand each number range in the input to a space-separated list of consecutive integers. For this to work, matches from the outer $regsubex need to be used in the <sub> parameter of the inner $regsubex.
The attentive reader will have noticed a problem with this; if \t's value includes commas, parentheses etc., a syntax error will occur, for the same reason that
var %a = a,b | echo -ag $upper( [ %a ] )
generates an error. So unless you are sure that your input will not contain those special characters, you are better off with an alias.
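As a rough sketch of the alias route, here is the range expansion example rewritten that way. The alias names expand_range and expand_range_demo are hypothetical. The outer matches are passed in as ordinary parameters ($1 and $2), so their values never have to be pasted into the inner <sub> through brackets, and the inner \n needs no special construction.

```
; Hypothetical helper: expands the range $1-$2 into consecutive integers.
; One dot is generated per number in the range; \n is the index of the
; dot being replaced, so $1 + \n - 1 counts upwards from $1.
alias expand_range {
  return $regsubex(x,$str(.,$calc($2 - $1 + 1)),/./g,$calc($1 + \n - 1) $chr(32))
}

; Hypothetical demo: the outer regsubex passes both captured numbers on.
alias expand_range_demo {
  echo -a $regsubex(5-7 10 14-18 20 23-29,/(\d+)-(\d+)/g,$expand_range(\1,\2))
}
```

Typing /expand_range_demo should give the same expansion as the one-line version. The same pattern applies when passing \t: the helper receives the value as $1, whatever characters it contains.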
— qwerty 2007/04/18 23:00