You can check whether your web application is vulnerable to UTF-7 encoded XSS by trying attack vectors such as the following:
If your application reflects the script back to the browser as executable content (showing the XSS alert dialog) after such attack vectors are entered as user input, it is vulnerable to UTF-7 encoded XSS. As you might know, reflected XSS can be used maliciously to steal confidential data stored in the victim's browser, such as cookies and other sensitive information.
So if your web application is vulnerable to UTF-7 encoded XSS attacks, you should check the server-side APIs that filter input data and make sure they do not fail to implement basic XSS countermeasures such as:
1) use white-list filtering (i.e., deny by default except for known-safe characters)
2) perform output encoding of input strings to their HTML entity equivalents before sending the output string back to the browser. Example: < becomes &lt; and > becomes &gt;
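Both countermeasures can be sketched in a few lines of Python (the function names and the particular white-list character set are illustrative, not from any specific framework):

```python
import html
import re

# 1) White-list filtering: deny by default, accept only known-safe characters.
SAFE_INPUT = re.compile(r"^[A-Za-z0-9 _.,-]*$")

def is_safe_input(data: str) -> bool:
    return bool(SAFE_INPUT.match(data))

# 2) Output encoding: convert HTML metacharacters to entities before
# echoing user input back to the browser.
def encode_output(data: str) -> str:
    return html.escape(data)  # "<" -> "&lt;", ">" -> "&gt;", "&" -> "&amp;", ...
```

Input that fails the white-list check is rejected outright; anything that is echoed back goes through the output encoder, so even a script tag that slips through renders as inert text.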
Unfortunately, some APIs in commercial web applications still use black-list regular expressions to strip script tags such as <script> and fail miserably when the attack vector is provided in an encoded form such as UTF-7.
If your application's filter API does use black listing, you can try to fix it by extending the regular expressions to also filter the UTF-7 encoded equivalents.
The problem I see with this approach is that soon enough another XSS payload could slip through unfiltered using yet another encoding, and the vulnerability would be exploitable again. The main problem is that black-listing all XSS attack vectors is hard: it is difficult to predict everything that is malicious (black listing), whereas it is much easier to define what is benign (white listing).
Also, do not forget to enforce the correct character encoding through meta tags (e.g., <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">), as in the UTF-7 XSS vulnerability that was previously found in Google (now fixed): http://www.governmentsecurity.org/forum/lofiversion/index.php/t18105.html.
Actually, if you enforce a different charset such as UTF-8 or ISO-8859-1, you will not be vulnerable to UTF-7 XSS, because the browser will render the page with the declared encoding and will not interpret the UTF-7 attack vector.
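You can see why the declared charset matters by decoding the same bytes under each interpretation (a Python sketch):

```python
utf7_attack = "+ADw-script+AD4-alert(1)+ADw-/script+AD4-"
attack_bytes = utf7_attack.encode("ascii")

# If the browser auto-detects UTF-7, the bytes turn into live markup...
as_utf7 = attack_bytes.decode("utf-7")

# ...but under an enforced UTF-8 (or ISO-8859-1) charset the very same
# bytes remain inert text containing no angle brackets at all.
as_utf8 = attack_bytes.decode("utf-8")
```

The attack only works when the browser is left to guess the encoding; an explicitly declared charset removes that ambiguity.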
But if you really want to implement an effective XSS filtering API, make sure to perform white-list filtering and output encoding, and test that the API does not fail to filter differently encoded attack vectors such as those listed at http://ha.ckers.org/xss.html.
My suggestion: do not re-invent the wheel :) use APIs that have been tested and vetted by the security community. Some of these APIs (e.g. HTML Purifier) have already been developed and tested against XSS attack vectors: http://htmlpurifier.org/live/smoketests/xssAttacks.php. If you are using .NET, look at the Microsoft Anti-XSS library; version 1.5 is freely available from MSDN: http://msdn2.microsoft.com/en-us/security/aa973814.aspx. XSS filtering APIs such as AntiSamy and the Encoding APIs are also available for download from OWASP.
OWASP AntiSamy (http://www.owasp.org/index.php/AntiSamy) is a library that parses and cleans HTML/CSS using a white-list validation technique. It was presented by Arshan Dabirsiaghi at the 2007 OWASP/WASC AppSec Conference in San Jose: www.owasp.org/images/e/e9/OWASP-WASCAppSec2007SanJose_AntiSamy.ppt. The OWASP encoding library, which is functionally equivalent to the Microsoft AntiXSS library, has been developed for several languages: http://www.owasp.org/index.php/Category:OWASP_Encoding_Project
More specifically, for PHP you can also look at the OWASP PHP AntiXSS library.