Source Management and Discovery

This page collects discussion and ideas on how Eiffel developers can discover existing code and libraries they might not know about, so that they can achieve a goal without accidentally reinventing the wheel.

The problem with any large software project with a score of engineers is that one developer writes some utility class, framework or library and it is never used by anyone but that developer (or perhaps a select few others who hear of it by word of mouth). For libraries and frameworks this tends not to be much of an issue, but utility classes often get replicated, or engineers grow frustrated at a lack of support that is actually there; they just don't know where to find it.

The problem is exacerbated in a decentralized office environment. The EiffelStudio production team has more members working remotely than in any of its offices, as has been the natural progression of things. When centralized, engineers are free to talk with one another and exchange recent developments openly. When decentralized, communication is reduced by distance and time zones.

The EiffelStudio source repository is no exception. There are cases of repeated implementation, some better than others. It is unreasonable to expect every engineer to have intimate knowledge of a large project and to keep up to date with every addition and modification made by all other engineers.

What is needed is a system in which one engineer can "tag" classes containing certain functionality, and another can search for that functionality.

A Quick Idea

This is an off-the-top-of-my-head idea, so it probably holds little value as an actual mechanism, but it illustrates the point.

Engineer A (let's call him EA) requires some functionality to iterate over directories and obtain a collection of files. From what he knows to be available, nothing suits his needs. EA sets out to implement the desired functionality (in this case using regular expressions for pattern matching) and, once it is done and tested, commits it to the repository. Before committing, EA annotates the new implementation with tags in the note clause, so that any other would-be consumer can locate it.

note
  keywords: "file(s), director(y|ies), file system, iteration, scanning, regex, regular, expression(s)"

class
  FS_UTILITY

...

end

A year later EA is no longer with the company; he went off to pursue a dream of serving cocktails in a straw hut on the beach, and is loving it. In short, he's not coming back. Engineer B (let's call him EB) joined the team after EA left and now wants to scan a directory to retrieve a list of files. He has found functionality in EiffelBase and Gobo, but it's not quite what he needs: EB needs pattern matching facilities. Trudging through hundreds, maybe thousands, of classes is an overwhelming prospect and the task is simple enough, so EB implements a utility class. Now there are two classes with similar functionality. EA's implementation is better and more flexible; EB's is limited but gets the job done. The end result is two points of failure for potential bugs, wasted time, more code to compile and a larger binary.

Changing the story to include a tag system: EB needs to scan a directory for files, using regular expressions for pattern matching. EB fires up the class-query tool and enters files and directory. He gets results for EiffelBase, Gobo and a class called FS_UTILITY. Intrigued, EB looks at all three sources and finds that FS_UTILITY best matches what he needs. EB could have gone one step further and searched for all classes matching the tags files, directory and regex, resulting in a single match and ending the search.
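To make the class-query idea concrete, here is a minimal sketch in Python of an external indexing tool. It is not an existing EiffelStudio facility. It assumes each class stores its tags as one comma-separated string in a keywords entry of its note clause. The class names DIR_LISTER_A and DIR_LISTER_B, and all the tag lists, are invented placeholders standing in for the EiffelBase and Gobo results in the story.

```python
import re
from collections import defaultdict

# Hypothetical class texts. Only the name FS_UTILITY comes from the story;
# the other two classes and all tag lists are made up for illustration.
SOURCES = {
    "FS_UTILITY": '''note
  keywords: "files, directory, file system, iteration, scanning, regex"

class
  FS_UTILITY
end
''',
    "DIR_LISTER_A": '''note
  keywords: "files, directory, iteration"

class
  DIR_LISTER_A
end
''',
    "DIR_LISTER_B": '''note
  keywords: "files, directory"

class
  DIR_LISTER_B
end
''',
}

# Assumption: tags live in a single manifest string after "keywords:".
KEYWORDS_RE = re.compile(r'^\s*keywords\s*:\s*"([^"]*)"', re.MULTILINE)


def extract_keywords(source):
    """Return the set of lower-cased tags found in a class text."""
    match = KEYWORDS_RE.search(source)
    if match is None:
        return set()
    return {tag.strip().lower() for tag in match.group(1).split(",") if tag.strip()}


def build_index(sources):
    """Build an inverted index mapping each tag to the classes carrying it."""
    index = defaultdict(set)
    for class_name, text in sources.items():
        for tag in extract_keywords(text):
            index[tag].add(class_name)
    return index


def query(index, *tags):
    """Return the classes that carry every one of the requested tags."""
    result = None
    for tag in tags:
        classes = index.get(tag.strip().lower(), set())
        result = set(classes) if result is None else result & classes
    return result if result is not None else set()


index = build_index(SOURCES)
print(sorted(query(index, "files", "directory")))
print(sorted(query(index, "files", "directory", "regex")))
```

Each extra tag narrows the result by set intersection, which is how adding regex reduces three candidates to one. A real tool would also have to handle shorthand such as file(s), which this sketch treats as a literal tag.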

If the keywords were also placed in a routine's note clause, a suggestion could be made for the specific feature to use. With keywords on both classes and routines, finding the right API feature among millions of lines of Eiffel code would save time, avoid wasted resources, provide a unique code browsing facility and promote code reuse.

Extending to On-Line

The quick idea presented above assumes that everything is locally available, that is, on a system disk or a network source. With an on-line library management system it is quite possible to have indexes built for libraries available from an on-line library repository and downloadable at the user's request. Searching for functionality would be a snap.
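One way to picture this is a small Python sketch that merges a local tag index with one published by a repository. Everything here is assumed for illustration: the JSON layout, the library names and the class names are hypothetical, and the two documents are embedded as strings rather than fetched over the network.

```python
import json

# Hypothetical index documents: one per library, mapping tag -> class names.
LOCAL_INDEX_JSON = '''
{
  "library": "local_utilities",
  "tags": {
    "files": ["FS_UTILITY"],
    "directory": ["FS_UTILITY"],
    "regex": ["FS_UTILITY"]
  }
}
'''

REMOTE_INDEX_JSON = '''
{
  "library": "example_remote_lib",
  "tags": {
    "files": ["REMOTE_FILE_SCANNER"],
    "directory": ["REMOTE_FILE_SCANNER"],
    "compression": ["REMOTE_ARCHIVER"]
  }
}
'''


def load_index(document, installed):
    """Parse one index document into tag -> {(library, class, installed)}."""
    data = json.loads(document)
    return {
        tag: {(data["library"], name, installed) for name in classes}
        for tag, classes in data["tags"].items()
    }


def merge(*indexes):
    """Combine several per-library indexes into a single searchable one."""
    merged = {}
    for index in indexes:
        for tag, entries in index.items():
            merged.setdefault(tag, set()).update(entries)
    return merged


def search(index, *tags):
    """Return the entries that carry every requested tag."""
    sets = [index.get(tag, set()) for tag in tags]
    return set.intersection(*sets) if sets else set()


combined = merge(
    load_index(LOCAL_INDEX_JSON, installed=True),
    load_index(REMOTE_INDEX_JSON, installed=False),
)
for library, class_name, installed in sorted(search(combined, "files", "directory")):
    status = "installed" if installed else "available for download"
    print(f"{library}: {class_name} ({status})")
```

Keeping an installed flag on every entry lets a query tool show which matches are already on disk and which would first have to be downloaded.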

Self Populating Keyword Tools

Authoring keywords can be somewhat time consuming, because a class author now needs to think of applicable keywords, but it is worth the effort. Given that functionality can often be inferred from a routine name, tools could be used (or EiffelStudio extended) to support the automatic generation of keywords. Such a tool could even go as far as examining the code's comments and note clauses to weigh which keywords are most applicable.
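As a rough sketch of what such a tool might do, the Python below suggests keywords from routine names and their header comments. The weighting (two points for a word in a routine name, one for a word in a comment), the stop-word list and the sample routines are all assumptions made for illustration. They do not describe any existing EiffelStudio feature.

```python
import re
from collections import Counter

# Assumed list of words too common to be useful as tags.
STOP_WORDS = {
    "a", "an", "the", "of", "in", "to", "for", "and", "or",
    "is", "be", "by", "on", "with", "from", "all", "that",
}


def tokens(text):
    """Split text into lower-case words, dropping stop words and very short words."""
    words = re.findall(r"[a-z]+", text.lower())
    return [word for word in words if word not in STOP_WORDS and len(word) > 2]


def suggest_keywords(features, limit=5):
    """Rank candidate keywords for a class.

    `features` is an iterable of (routine_name, header_comment) pairs.
    Words taken from routine names score 2, words from comments score 1.
    """
    scores = Counter()
    for name, comment in features:
        for word in tokens(name.replace("_", " ")):
            scores[word] += 2
        for word in tokens(comment):
            scores[word] += 1
    # Highest score first; ties are broken alphabetically for stable output.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [word for word, _ in ranked[:limit]]


# Hypothetical routines of a file system utility class.
FEATURES = [
    ("scan_directory", "Files in a directory whose names match a regular expression"),
    ("matching_files", "Files matching a pattern"),
    ("is_directory_readable", "Can the directory be read?"),
]

print(suggest_keywords(FEATURES))
```

The sketch does no stemming, so match and matching count as different words. The class author would still review the suggestions before committing them to the note clause.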