Say you need to find occurrences of the word raktapaṭa-, lit. “red-robe [wearer]” (a perhaps not-so-polite name for Buddhist monks). Of course you could check Monier-Williams' Sanskrit-English dictionary and the St. Petersburg dictionary. BBEdit does a nice job of searching the digital version. Use incremental search (= “Quick Search”, alt-cmd-f) to swiftly find <raktapaṭa>. The dictionary will then show you something like the following window:
This is all very well, but there are only four references given. Worse, the sāṃkhyabhikṣu with which raktapaṭa is identified looks like a corruption of śākyabhikṣu. A quick check shows that MW gives no attestation for this form. What I really want to do is search all of my etexts to check occurrences of raktapaṭa. It would of course be wrong to do a multifile search for raktapaṭa, because in sandhi one might expect to find the final “a” changed. So, if I just search for raktapaṭ, I should avoid the final sandhi problem. This would already give me quite a few hits, and I might be tempted to leave it at that.

But there is more. I really ought to take into account close synonyms of raktapaṭa. “Red-robe [wearer]” is a bahuvrīhi compound made up of two members, each of which has many synonyms. What I would really like to search for in my etexts is “any word meaning red” followed by “any word meaning garment or cloth”. This is where GREP comes in. Let's select only a few synonyms for this example: rakta, aruṇa, piṅgala for “red”, and paṭa, vāsa, ambara for “garment”.

First we need to allow for alternation. This is easy: in GREP syntax the alternation operator is “|”, so (cat|dog|mouse) means “cat” or “dog” or “mouse”. The parentheses group the alternatives together so that they are treated as a unit.

Next we need to worry about sandhi. Rakta followed by ambara becomes raktāmbara. For this problem we need to allow a “range”, or “set”, of characters to follow “rakt”. Actually, here only two are necessary, either short “a” or long “ā”. GREP uses square brackets for this: [abc] means either “a” or “b” or “c”.

Here a further problem arises. In some etexts sandhi has been dissolved, in some it has not. Also, dissolved sandhi is marked by different characters: say a hyphen “-” or a plus sign “+”. So we need to allow for a hyphen etc. to be there or not. This can be done by adding the GREP operator for “zero or one occurrences”, a question mark “?”. So the GREP pattern: [-+]?
means “zero or one of either a hyphen or a plus sign”. Putting this all together, we can switch on the GREP option in the BBEdit search window (cmd-g), enter the following search string, and search through all of our etexts:
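One way the pieces described above (alternation, a character set for the sandhi vowel, and an optional separator) might be combined is sketched here with Python's `re` module rather than in BBEdit. The stem lists, the `a?mbar` spelling that lets the initial “a” of ambara merge into a preceding long “ā”, and the function names `build_pattern` and `find_compounds` are assumptions made for illustration; this is not the search string from the BBEdit window.

```python
import re
import unicodedata

# Stems with the final "a" removed, so the sandhi vowel can be matched separately.
RED = ["rakt", "aruṇ", "piṅgal"]
# "a?mbar": the initial "a" of ambara is optional, since it may be
# absorbed into a preceding long "ā" (rakta + ambara -> raktāmbara).
GARMENT = ["paṭ", "vās", "a?mbar"]


def build_pattern(first, second):
    """Join two synonym lists into: (A|B|C)[aā][-+]?(X|Y|Z)."""
    body = "(" + "|".join(first) + ")[aā][-+]?(" + "|".join(second) + ")"
    # Normalize to NFC so each diacritic letter is a single code point.
    return re.compile(unicodedata.normalize("NFC", body))


PATTERN = build_pattern(RED, GARMENT)


def find_compounds(text):
    """Return every matched stretch of text, in order of appearance."""
    text = unicodedata.normalize("NFC", text)
    return [m.group(0) for m in PATTERN.finditer(text)]


sample = "raktapaṭāḥ raktāmbaradharaḥ aruṇa-vāsasaḥ piṅgala+ambaraḥ nīlapaṭa"
print(find_compounds(sample))
```

In this sketch both the pattern and the text are normalized to NFC first, because etexts may encode a letter such as “ṭ” either as one precomposed character or as a base letter plus a combining mark, and the two forms would otherwise not match each other.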