Your idea sounds feasible. Are you going to use a real-world dataset? Converting a dataset mined or downloaded from the Internet into the format you need is usually non-trivial, so that work can also count as "something extra".