kmail
EncodingDetector Class Reference
#include <encodingdetector.h>
Detailed Description
Provides encoding detection capabilities. Guesses the encoding of a char array. Searches for an encoding declaration inside the raw data (meta and XML tags). If it cannot find one, it uses heuristics for the specified language.
If it finds Unicode BOM marks, it changes the encoding regardless of what the user has specified.
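The BOM rule above can be illustrated with a minimal, self-contained sketch. The helper name `encodingFromBom` is hypothetical and is not part of EncodingDetector; it only shows how a leading byte-order mark maps to an encoding name, and it deliberately ignores UTF-32 marks.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper, not part of EncodingDetector: returns the encoding
// name implied by a leading byte-order mark, or an empty string if the
// buffer does not start with one. UTF-32 marks are not handled here.
std::string encodingFromBom(const unsigned char *data, std::size_t len)
{
    if (len >= 3 && data[0] == 0xEF && data[1] == 0xBB && data[2] == 0xBF)
        return "UTF-8";
    if (len >= 2 && data[0] == 0xFE && data[1] == 0xFF)
        return "UTF-16BE";
    if (len >= 2 && data[0] == 0xFF && data[1] == 0xFE)
        return "UTF-16LE";
    return std::string();
}
```

In a detector that follows the documented behaviour, a non-empty result from such a check would take precedence over a user-chosen encoding.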
Intended lifetime of the object: one instance per document.
Typical use:
    QByteArray data;
    ...
    EncodingDetector detector;
    detector.setAutoDetectLanguage(EncodingDetector::Cyrillic);
    QString out = detector.decode(data);
Do not mix decode() with decodeWithBuffering().
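The search for an encoding declaration mentioned in the description can be sketched as follows. This is an illustration under stated assumptions, not the class's actual parser: the helper `declaredEncoding` is hypothetical, and it simply looks for the first `charset=` (HTML meta tag) or `encoding=` (XML declaration) in the raw text without checking that the match sits inside a tag.

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper: returns the encoding name declared in raw markup,
// or an empty string if no declaration is found.
std::string declaredEncoding(const std::string &raw)
{
    // Work on a lower-cased copy so the search is case-insensitive.
    std::string lower(raw);
    std::transform(lower.begin(), lower.end(), lower.begin(),
                   [](unsigned char ch) { return static_cast<char>(std::tolower(ch)); });

    const std::string keys[] = {"charset=", "encoding="};
    for (const std::string &key : keys) {
        std::size_t pos = lower.find(key);
        if (pos == std::string::npos)
            continue;
        pos += key.size();
        // Skip an optional opening quote.
        if (pos < raw.size() && (raw[pos] == '"' || raw[pos] == '\''))
            ++pos;
        // Collect the characters that may appear in an encoding name.
        const std::size_t start = pos;
        while (pos < raw.size()) {
            const unsigned char ch = static_cast<unsigned char>(raw[pos]);
            if (!std::isalnum(ch) && ch != '-' && ch != '_' && ch != '.' && ch != ':')
                break;
            ++pos;
        }
        return raw.substr(start, pos - start);
    }
    return std::string();
}
```

A name found this way would still need to be validated, for example by passing it to setEncoding() and checking the return value.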
Definition at line 57 of file encodingdetector.h.
Public Types
enum EncodingChoiceSource { DefaultEncoding, AutoDetectedEncoding, BOM, EncodingFromXMLHeader, EncodingFromMetaTag, EncodingFromHTTPHeader, UserChosenEncoding }
enum AutoDetectScript { None, SemiautomaticDetection, Arabic, Baltic, CentralEuropean, ChineseSimplified, ChineseTraditional, Cyrillic, Greek, Hebrew, Japanese, Korean, NorthernSaami, SouthEasternEurope, Thai, Turkish, Unicode, WesternEuropean }
Public Member Functions
EncodingDetector ()
EncodingDetector (QTextCodec *codec, EncodingChoiceSource source, AutoDetectScript script=None)
~EncodingDetector ()
bool setEncoding (const char *encoding, EncodingChoiceSource type)
const char * encoding () const
bool visuallyOrdered () const
void setAutoDetectLanguage (AutoDetectScript)
AutoDetectScript autoDetectLanguage () const
EncodingChoiceSource encodingChoiceSource () const
bool analyze (const char *data, int len)
bool analyze (const QByteArray &data)
Static Public Member Functions
static AutoDetectScript scriptForName (const QString &lang)
static QString nameForScript (AutoDetectScript)
static AutoDetectScript scriptForLanguageCode (const QString &lang)
static bool hasAutoDetectionForScript (AutoDetectScript)
Protected Member Functions
bool errorsIfUtf8 (const char *data, int length)
QTextDecoder * decoder ()
Constructor & Destructor Documentation
EncodingDetector::EncodingDetector ()
The default codec is Latin-1 (as the HTML spec says), EncodingChoiceSource is DefaultEncoding, and AutoDetectScript is SemiautomaticDetection.
Definition at line 877 of file encodingdetector.cpp.
EncodingDetector::EncodingDetector (QTextCodec *codec, EncodingChoiceSource source, AutoDetectScript script = None)
Allows setting the default codec, EncodingChoiceSource, and AutoDetectScript.
Definition at line 881 of file encodingdetector.cpp.
Member Function Documentation
bool EncodingDetector::setEncoding (const char *encoding, EncodingChoiceSource type)
Returns:
    true if the specified encoding was recognized
Definition at line 926 of file encodingdetector.cpp.
const char * EncodingDetector::encoding () const
Convenience method.
Returns:
    MIME name of the detected encoding
Definition at line 905 of file encodingdetector.cpp.
bool EncodingDetector::analyze (const char *data, int len)
Analyze text data.
Returns:
    true if there was enough data for accurate detection
Definition at line 986 of file encodingdetector.cpp.
bool EncodingDetector::analyze (const QByteArray &data)
Analyze text data.
Returns:
    true if there was enough data for accurate detection
Definition at line 981 of file encodingdetector.cpp.
EncodingDetector::AutoDetectScript EncodingDetector::scriptForName (const QString &lang) [static]
bool EncodingDetector::errorsIfUtf8 (const char *data, int length) [protected]
Checks whether the data is really UTF-8.
Taken from Kate.
Returns:
    true if the current encoding is UTF-8 and the text cannot be in this encoding
Definition at line 813 of file encodingdetector.cpp.
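The kind of check errorsIfUtf8() describes can be sketched with a simplified structural UTF-8 validator. This is not the code taken from Kate: the helper `hasUtf8Errors` is hypothetical, it only covers the "text cannot be in this encoding" half of the documented return value, it treats a sequence truncated at the end of the buffer as an error, and it does not reject every overlong or surrogate encoding.

```cpp
#include <cstddef>

// Hypothetical helper, not the library's errorsIfUtf8(): returns true if the
// buffer contains a byte sequence that is not structurally valid UTF-8.
bool hasUtf8Errors(const char *data, std::size_t length)
{
    std::size_t i = 0;
    while (i < length) {
        const unsigned char c = static_cast<unsigned char>(data[i]);
        std::size_t extra; // number of continuation bytes expected after c
        if (c < 0x80)
            extra = 0; // plain ASCII
        else if (c >= 0xC2 && c <= 0xDF)
            extra = 1;
        else if (c >= 0xE0 && c <= 0xEF)
            extra = 2;
        else if (c >= 0xF0 && c <= 0xF4)
            extra = 3;
        else
            return true; // stray continuation byte or invalid lead byte

        if (i + extra >= length)
            return true; // sequence is cut off by the end of the buffer

        // Every continuation byte must have the bit pattern 10xxxxxx.
        for (std::size_t k = 1; k <= extra; ++k) {
            const unsigned char b = static_cast<unsigned char>(data[i + k]);
            if ((b & 0xC0) != 0x80)
                return true;
        }
        i += extra + 1;
    }
    return false;
}
```

Text in a single-byte encoding such as Latin-1 that contains accented characters usually fails this check quickly, which is what makes it useful as a sanity test before trusting a UTF-8 guess.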
QTextDecoder * EncodingDetector::decoder () [protected]
The documentation for this class was generated from the following files: