EPI-LI-0003

Title

Encoding detection

One line summary

Provide a library that detects the encoding of a given text stream.

Status

Not started

Description

The Eiffel encoding library currently defines an interface for an encoding detector but provides no implementation. The idea is to implement encoding detectors for general text, Eiffel code, HTML, XML and so on.

Encoding detection already has implementations, notably in browsers. It is relatively easy for text streams of a known type, such as HTML or XML, which usually either declare their encoding or fall back to a default one. The difficulty lies in detecting the encoding of general text. The idea is to study the strategies used by other implementations and reuse them in Eiffel.
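The layered strategy described above can be sketched as follows. This is only an illustration, written in Python for brevity, and not the Eiffel library's API: the function name `detect_encoding`, the order of the checks, and the choice to return `None` when nothing is recognized are all assumptions. It checks for a byte order mark, then for an XML declaration, and finally tests whether the bytes are valid UTF-8. A real general-text detector would need statistical heuristics beyond this.

```python
import codecs
import re

# Byte order marks. UTF-32 comes before UTF-16 because the UTF-32 LE mark
# starts with the same two bytes as the UTF-16 LE mark.
_BOMS = [
    (codecs.BOM_UTF32_LE, "UTF-32LE"),
    (codecs.BOM_UTF32_BE, "UTF-32BE"),
    (codecs.BOM_UTF8, "UTF-8"),
    (codecs.BOM_UTF16_LE, "UTF-16LE"),
    (codecs.BOM_UTF16_BE, "UTF-16BE"),
]

# Matches the encoding attribute of an XML declaration at the start of the
# data (ASCII-compatible encodings only).
_XML_DECL = re.compile(
    rb'^<\?xml[^>]*?encoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._-]*)["\']'
)


def detect_encoding(data: bytes):
    """Hypothetical detector: return an encoding name, or None if unknown."""
    # 1. An explicit byte order mark is the strongest signal.
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    # 2. A known type (here XML) may declare its own encoding.
    match = _XML_DECL.match(data)
    if match:
        return match.group(1).decode("ascii").upper()
    # 3. XML without a declared encoding defaults to UTF-8.
    if data.startswith(b"<?xml"):
        return "UTF-8"
    # 4. General text: accept UTF-8 only if the bytes decode cleanly.
    try:
        data.decode("utf-8")
        return "UTF-8"
    except UnicodeDecodeError:
        return None  # a real detector would apply further heuristics here
```

The last step is where the difficulty mentioned above shows up: bytes that are not valid UTF-8 could belong to any of several legacy encodings, and telling them apart generally requires frequency analysis or similar heuristics.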

Skills needed

Knowledge of character encodings and encoding detection.

Difficulty

High

Benefits

The student will learn about the various encoding schemes used for files, as well as about Eiffel and library design.

It will help the community by providing a way to read external data easily without having to worry much about the encoding used.

Licensing

EFLv2

Documentation

In the Eiffel source code, and at http://docs.eiffel.com for user-level documentation.

Submitter

Eiffel Software

Possible mentor

Ted Feng