ProposalProjectFiles

Revision as of 22:57, 29 May 2006 by Zoran (Pseudo-code algorithm)

Rationale

Currently, estudio 5.7 handles project files as follows:

  • a .ecf file describes the Eiffel system to compile, without any compiled-project specific info (see Configuration)
  • .ecf files used to be called .acex, but this has been abandoned in favor of the .ecf extension
  • compiled-project specific info is stored in a .user file, which resides in the same folder as the corresponding .ecf
  • the .user file is a stored object of type USER_OPTIONS
  • the ConfigurationMigration page describes how .ace files are migrated to the new .ecf format, and gives some info on the structure of compiled-project files
  • a folder called EIFGENs holds all the files representing a compiled project
  • the .user file contains the path to the EIFGENs folder
  • it is possible to have the .ecf and .user files in another folder than the one containing the EIFGENs (but it is not possible to share the same .user file for different users and machines)


There are some limitations to the current implementation:

  • the .ecf file cannot be used from a network share: the .user file contains info specific to the machine that performed the compilation, such as a path to the EIFGENs folder that is meaningful only on the machine that created the project
  • if the .ecf file is not in the same folder as the EIFGENs, the .user file binds it to that folder anyway; we might as well fix the location of the EIFGENs folder and require it to be in the same folder as the .ecf
  • the .user file could contain more info, allowing a more coherent management of Eiffel projects; the following scenarios are not handled correctly right now:
    • a group of developers wants to share the same .ecf file, but have different compilation folders
    • if a user compiles a project with an environment variable defined (GOBO, for example), then closes the project and reopens it but forgets to define GOBO this time around, the compiler fails with an obscure error stating that "/library/kernel" is missing; the user has to guess that GOBO is not defined
    • opening the same project several times at once usually leads to project corruption, and there is no way to know whether a project is already open or not

Specification

  • It should be possible to open any project from a file (by double-clicking on the file for example)
  • Opening an already compiled project should restore the environment as it was when the project was compiled (as much as possible, for coherence)
  • It should be possible to instruct the compiler to recompile a project from scratch using that same file
  • It should be possible for third party tools (other than estudio compiler itself) to work with that project description file (why not, many people could contribute tools that work with estudio projects)
  • We consider that opening a project from estudio itself is covered no matter what we choose (estudio will always show an "open project" dialog that lists previously compiled projects and allows creating a new project using a wizard)
  • The interesting bits are for "batch" or "automated" project creation and compilation
  • It should be possible to have different targets of the same project compiled with different versions of estudio, situations in which this is useful:
    • one wants to compile a 'win32' and 'win64' version of the same project
    • one wants to compile a 'release' and 'experimental' version of the same project using 2 different estudio deliveries

Design

  • a .ecf file describes an Eiffel system in general (it's equivalent to an Eiffel class)
  • a .ecp file is also generated by the compiler holding all the other project specific info (it's equivalent to an Eiffel object - an instance of a .ecf 'class' mentioned above)
    • the .ecp file contains all the info currently in the .user file, and more
    • the .ecp file can be used to compile a project from scratch etc.

Project structure

  • .ecf file can be located anywhere, can be shared amongst users, machines, estudio versions etc.
  • .ecp file can be located anywhere (but will reside in the folder containing the EIFGENs most of the time)
  • .ecp file is not supposed to be shared, it's specific to a machine and to a user
  • an EIFGENs folder can be deleted and recreated entirely from the info made available in a .ecp file
  • an EIFGENs folder can be opened in the state it was left after last compilation from the info made available in a .ecp file

The configuration file: .ecf

The .ecf file remains as it is now, containing the definition of the system to compile, possibly with several targets (but no information relative to an actual compilation is stored in there). See Configuration.

If a user double-clicks or otherwise tries to open a project given only the .ecf file, then the following happens:

  • in batch mode, an error is issued stating that a project can not be opened from a .ecf file
  • in GUI mode, recently compiled projects corresponding to the .ecf are shown to the user, who has the option of selecting one of them or browsing to the actual project location

The project file: .ecp

A .ecp file is similar to the .ecf file, but contains only project specific info. It is also an XML file serialized from a set of dedicated Eiffel objects.

A project may be opened from a .ecp file easily, since it holds all the needed info.

A .ecp file holds all the info relative to the current compilation. This info is (exhaustively):

  • the path to the .ecf file
  • a list of settings per compilation target
  • the 'last used target' (for convenience)
  • the 'project path' containing the EIFGENs
  • each 'compilation target' contains the following info:
    • the name of the target
    • the version of estudio that compiled the target
    • the name of the machine that built the project
    • the username of the account under which the project was built
    • the working directory that the user chose (by default it's the folder containing the .ecp file)
    • the last command line arguments the project was started with (in debug mode)
    • the list of used command line arguments (for convenience)
    • whether command line arguments are 'active'
    • all the used environment variables (an environment variable is 'used' if it appears in the .ecf file)

The .ecp file should not be a stored Eiffel object for 2 reasons:

  • it depends on the version of storable used, and presents retrieval problems if the API of the stored object is modified
  • if contributors want to open that storable, they have to have the right Eiffel class in their system

.ecp XML structure

<?xml version="1.0" encoding="ISO-8859-1"?>
<project
	xmlns="http://www.eiffel.com/developers/xml/project-1-0-0"
	xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
	xsi:schemaLocation="http://www.eiffel.com/developers/xml/project-1-0-0 http://www.eiffel.com/developers/xml/project-1-0-0.xsd"
	>
	<settings
		ecf="/path/to/ecf/sample.ecf"
		last_target="target1"
		project_path="/projects/sample"
	/>
	<target
		name="target1"
		estudio="5.7.58953"
		host="zebra3.example.com"
		username="marty"
		working_directory="/projects/testing"
	>
		<command_line active="true" last="--help">
			<variable value="--help"/>
			<variable value="--use this and that"/>
			<variable value="--crack a smile"/>
			...
		</command_line>
		<environment>
			<variable name="ISE_EIFFEL" value="/path/to/estudio/5.7.58953"/>
			<variable name="GOBO" value="/path/to/gobo"/>
			...
		</environment>
	</target>
	<target name="target2">
		...
	</target>
</project>
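
To illustrate how a third-party tool might read such a file, here is a minimal sketch in Python. It is not part of the proposal: the names Project, Target and parse_ecp are hypothetical, and the sketch assumes a well-formed file in which every variable element is self-closed and the namespace is the one shown in the sample.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field

# Namespace URI copied from the sample .ecp file.
NS = {"p": "http://www.eiffel.com/developers/xml/project-1-0-0"}


@dataclass
class Target:  # hypothetical counterpart of a <target> element
    name: str
    estudio: str = ""
    host: str = ""
    username: str = ""
    working_directory: str = ""
    use_arguments: bool = False
    last_argument: str = ""
    arguments: list = field(default_factory=list)
    environment: dict = field(default_factory=dict)


@dataclass
class Project:  # hypothetical counterpart of the <project> element
    ecf: str
    last_target: str
    project_path: str
    targets: dict = field(default_factory=dict)


def parse_ecp(data: bytes) -> Project:
    """Parse the bytes of a .ecp file into a Project."""
    root = ET.fromstring(data)
    settings = root.find("p:settings", NS)
    project = Project(
        ecf=settings.get("ecf", ""),
        last_target=settings.get("last_target", ""),
        project_path=settings.get("project_path", ""),
    )
    for node in root.findall("p:target", NS):
        target = Target(
            name=node.get("name", ""),
            estudio=node.get("estudio", ""),
            host=node.get("host", ""),
            username=node.get("username", ""),
            working_directory=node.get("working_directory", ""),
        )
        command_line = node.find("p:command_line", NS)
        if command_line is not None:
            target.use_arguments = command_line.get("active") == "true"
            target.last_argument = command_line.get("last", "")
            target.arguments = [
                v.get("value", "") for v in command_line.findall("p:variable", NS)
            ]
        environment = node.find("p:environment", NS)
        if environment is not None:
            target.environment = {
                v.get("name", ""): v.get("value", "")
                for v in environment.findall("p:variable", NS)
            }
        project.targets[target.name] = target
    return project


# Reduced, well-formed variant of the sample, for demonstration only.
SAMPLE = b"""<?xml version="1.0" encoding="ISO-8859-1"?>
<project xmlns="http://www.eiffel.com/developers/xml/project-1-0-0">
  <settings ecf="/path/to/ecf/sample.ecf" last_target="target1"
            project_path="/projects/sample"/>
  <target name="target1" estudio="5.7.58953" host="zebra3.example.com"
          username="marty" working_directory="/projects/testing">
    <command_line active="true" last="--help">
      <variable value="--help"/>
      <variable value="--crack a smile"/>
    </command_line>
    <environment>
      <variable name="GOBO" value="/path/to/gobo"/>
    </environment>
  </target>
</project>
"""

project = parse_ecp(SAMPLE)
print(project.last_target, project.targets["target1"].environment)
```

Because the file is plain XML, a tool like this needs neither the Eiffel classes nor a matching storable version, which is the point made above about not using a stored Eiffel object.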

ECP classes

indexing
	description: "Describes an Eiffel project"
	date: "$Date: $"
	revision: "$Revision: $"
 
class
	ECP_PROJECT
 
feature
 
	ecf: STRING
			-- Path to .ecf file
 
	last_target: STRING
			-- Name of last used target
 
	project_path: STRING
			-- Path to folder containing the EIFGENs generated folder
 
	targets: LIST [ECP_TARGET]
			-- The configuration targets.
 
end

indexing
	description: "Describes an Eiffel project target"
	date: "$Date: $"
	revision: "$Revision: $"
 
class
	ECP_TARGET
 
feature
 
	name: STRING
			-- Name of the target
 
	estudio: STRING
			-- Version of estudio chosen to compile the target
 
	host: STRING
			-- Name of the machine (host) where the target was compiled
 
	username: STRING
			-- Username under which the compilation took place
 
	working_directory: STRING
			-- Working directory selected by the user
			-- Optional, if not specified, the working folder is the folder containing the .ecp file
 
	use_arguments: BOOLEAN
			-- Use arguments?
 
	last_argument: STRING
			-- Last used argument.
 
	arguments: LIST [STRING]
			-- List of arguments used by current target
 
	environment_variables: LIST [ECP_ENVIRONMENT_VARIABLE]
			-- List of environment variables referred to in the .ecf file for current target
 
end

What does this bring us?

  • it is possible to open an Eiffel project from a .ecp file, with the right version of estudio, environment settings etc.
  • it is possible to recompile a project from scratch from a .ecp file easily (all the needed info is there)
  • if environment variables changed since the project was compiled, the user can be warned

What to do with the 'environment settings'

Why do we keep the 'used environment variables'? Why do they matter? Typically, when compiling a project, developers refer to clusters using environment variables. But what should happen if compilation1 used var1, and compilation2 uses a different value for var1? Right now, nothing special happens, compilation2 simply goes on, not even knowing that var1 changed...

It is however important to detect this, and it is very easy to implement too. Many situations can lead to compilation errors and corrupted projects because of changed environment variables. Problems typically occur in companies with many users and machines juggling a set of different environments. Also, a project is often set up with all the right environment variables, but users later want to simply open it without worrying about which environment variables were used to set it up. Keeping these in the .ecp file would allow a "simply double-click the .ecp file to open the project" workflow.

The following suggestion should take care of this problem elegantly.

  • When a compilation is initiated, the 'env vars' list in the project settings file is empty
  • The following steps would be implemented in an 'initial pass' (before degree 6) by the compiler
  • Each time the compiler encounters an env var in the .ecf file (and only there), it looks at what was previously present in the .ecp:
    • if the env var is used for the first time (not yet defined in the .ecp) then set it in the .ecp
    • if the env var was already there, then compare it:
      • if the current env var value is equal to the one stored in the .ecp, then it's good, just continue compilation
      • if they differ, then remember that they differed and remember both values
    • do the above for all env vars referred to in the .ecf
  • Now at the end of this "initial pass" we have a set of "differing" env vars
    • if the set is empty, go on to degree 6 as usual
    • if the set is not empty, then proceed as follows (2 different situations)
      • in GUI mode, pop up a modal dialog showing the differences and asking the user whether it's OK to continue (a nice touch would be to let the user choose which value to use for each env var, old or new)
      • in batch mode, 2 different cases
        • abort compilation if for example a -strict flag is specified on the command line
        • continue compilation with a warning listing the diffs otherwise
  • Apply the same logic for the "hostname" and "username" info in the .ecp file (also before degree 6)
    • if the project is being opened on a different machine than the one that created the last .ecp:
      • in GUI mode, pop up the warning dialog, asking the user whether it's OK to continue opening the project or not
      • in batch mode:
        • fail if -strict is specified
        • warn otherwise
  • If the compilation proceeds (either because user said OK, or because -strict was not specified), overwrite the info in the .ecp with the new values
  • The same logic could be applied to other pieces of info in the .ecp, such as the project path (some users open their projects from local folders such as C:\projects on Windows as well as from UNC folders such as \\MACHINE\projects, sometimes at the same time!)
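
The comparison steps above can be sketched as follows in Python. This is only an illustration under stated assumptions: the function and exception names are hypothetical, the stored values are modeled as a plain dictionary standing in for the .ecp contents, an undefined variable is treated as an empty string, and only the batch-mode behaviour (warn, or abort with -strict) is covered.

```python
import os


class EnvironmentMismatch(Exception):
    """Raised in strict mode when stored and current values differ."""


def find_mismatches(stored, used_names, current):
    """Return {name: (old, new)} for used variables whose value changed.

    Variables seen for the first time are recorded in `stored`.
    """
    mismatches = {}
    for name in used_names:
        new_value = current.get(name, "")  # assumption: unset means ""
        if name not in stored:
            stored[name] = new_value
        elif stored[name] != new_value:
            mismatches[name] = (stored[name], new_value)
    return mismatches


def check_environment(stored, used_names, strict=False, current=None):
    """Batch-mode check: abort if strict, otherwise warn and continue."""
    current = os.environ if current is None else current
    mismatches = find_mismatches(stored, used_names, current)
    if mismatches:
        report = ", ".join(
            f"{name}: {old!r} -> {new!r}"
            for name, (old, new) in sorted(mismatches.items())
        )
        if strict:
            raise EnvironmentMismatch(report)
        print("warning: environment changed since last compilation: " + report)
        # Compilation proceeds, so the stored values are overwritten.
        for name, (_, new) in mismatches.items():
            stored[name] = new
    return mismatches


stored = {"GOBO": "/path/to/gobo"}
check_environment(stored, ["GOBO"], current={"GOBO": "/other/gobo"})
```

Passing the environment explicitly through `current` keeps the check easy to test; a real tool would rely on the process environment and on the list of variables actually found in the .ecf.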

Handling projects being opened several times at once

Typically, if the same project (same EIFGENs) is opened several times at once (from different machines, for example), there is a risk of the project getting corrupted. There is a very simple way to handle this problem elegantly:

  • create a file in /<project path>/EIFGENs/<target>/ec.lock as soon as the project is open and put the following info in it:
    • the estudio version
    • the hostname and username opening the project
    • the date and time when it was opened
    • the "process_id" of the compiler opening the project (and creating this .lock file)
  • delete this file as soon as the project is closed (i.e., the batch job exits, or the user closes the IDE)
  • if the compiler crashes and doesn't get the chance to delete this file, it's OK: the leftover file is handled the next time the project is opened
  • if when opening the project the file already exists, then 2 situations:
    • in batch mode:
      • exit immediately with an error saying that the "ec.lock" file needs to be deleted (either manually or by closing the other running process) before the project can be opened in batch mode
    • in GUI mode:
      • pop up a dialog asking the user what to do; the dialog shows the following info and choices:
        • which user/host/date has the project opened
        • "Open read-only" - allows the user to open the project without the capability to compile it (so as to avoid corrupting the project)
        • "Open anyway, I know it's not open" - this option deletes the .lock file and recreates it (this would typically happen if the compiler crashed and did not get a chance to delete this .lock file)
        • "Cancel" the project opening operation

Suggested format for this .lock file is also XML, here's an example of such a file:

<?xml version="1.0" encoding="ISO-8859-1"?>
<project
	xmlns="http://www.eiffel.com/developers/xml/project-1-0-0"
	xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
	xsi:schemaLocation="http://www.eiffel.com/developers/xml/project-1-0-0 http://www.eiffel.com/developers/xml/project-1-0-0.xsd"
	>
	<lock
		estudio="5.7.58953"
		host="zebra3.example.com"
		username="marty"
		date="2006/05/28 17:56:27"
		pid="2451"
	/>
</project>

The hostname and username info has an obvious added value (one knows who has the project open and where). The 'process_id' info would allow third-party Eiffel project management tools to eventually "communicate" with the ec.exe process that has the project open, kill the process, change its priority, etc.

indexing
	description: "Describes an Eiffel project lock file"
	date: "$Date: $"
	revision: "$Revision: $"
 
class
	ECP_LOCK
 
feature
 
	estudio: STRING
			-- Version of estudio that created the .lock file
 
	host: STRING
			-- Name of the machine (host) where the .lock file was created
 
	username: STRING
			-- Username under which the .lock file was created
 
 
	date: STRING
			-- Date when .lock file was created
			-- Maybe this does not need to be an attribute...
			-- The creation date of the file itself should give same info anyway
 
	pid: INTEGER
			-- Process id of the process that created this lock file
 
end
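
For illustration, a tool could create and remove the ec.lock file along these lines. This Python sketch is not part of the proposal: acquire_lock and release_lock are hypothetical names, the host and username are passed in by the caller, and the XML layout follows the sample above without the schema attributes. It relies on exclusive file creation so that two processes cannot both believe they created the lock; how reliable that is on a network share depends on the file system.

```python
import os
import time
from xml.sax.saxutils import quoteattr

LOCK_TEMPLATE = (
    '<?xml version="1.0" encoding="ISO-8859-1"?>\n'
    '<project xmlns="http://www.eiffel.com/developers/xml/project-1-0-0">\n'
    "\t<lock estudio={estudio} host={host} username={username}"
    " date={date} pid={pid}/>\n"
    "</project>\n"
)


def acquire_lock(lock_path, estudio, host, username):
    """Create the lock file; return False if it already exists."""
    try:
        # O_EXCL makes the call fail when the file is already there.
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False
    with os.fdopen(fd, "w", encoding="iso-8859-1") as lock_file:
        lock_file.write(
            LOCK_TEMPLATE.format(
                estudio=quoteattr(estudio),
                host=quoteattr(host),
                username=quoteattr(username),
                date=quoteattr(time.strftime("%Y/%m/%d %H:%M:%S")),
                pid=quoteattr(str(os.getpid())),
            )
        )
    return True


def release_lock(lock_path):
    """Delete the lock file when the project is closed."""
    try:
        os.remove(lock_path)
    except FileNotFoundError:
        pass  # already gone, nothing to do
```

A caller in batch mode would exit with an error when acquire_lock returns False; a GUI would instead offer the "Open read-only", "Open anyway" and "Cancel" choices described above.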


Command-line arguments for 'ec' and 'estudio'

'ec' and 'estudio' should accept similar command line arguments for coherence

  • To create a project, user must provide 2 arguments: -project_path and -config
  • To simply open a project, the user provides -project argument

'ec' command line arguments (relative to project management)

-config <ecf>
Path to the .ecf file to compile. When this argument is passed, -project_path must be specified as well.
-project_path <folder>
Specifies the folder where to create the .ecp file (same name as given .ecf file). If a .ecp file is already there, it is overwritten (recreated)
-project <ecp>
Path to project to open.
-target <target>
Target to compile (optional, if not provided, 'last_target' from .ecp is used)
-strict
Flag indicating whether environment settings must match across compilations
If not specified, the compiler issues a warning if environment variables were modified since the last compilation
If specified, the compiler aborts with an error listing the mismatching environment variables
  • Command line arguments -config and -project_path must be specified together for project creation, when these 2 arguments are provided:
    • a new .ecp is always created (existing .ecp is deleted and recreated)
    • any existing EIFGENs is deleted prior to compilation
    • EIFGENs folder is stored in same 'project_path' folder as the .ecp
    • users can choose a different path for the EIFGENs folder either by:
      • creating a .ecp file themselves
      • using the graphical IDE, which could allow for such customization (ec.exe in batch mode does not)
  • -project is used to open a project that has already been compiled once
  • the pair [-project_path, -config] is exclusive with -project (users cannot specify -project along with -project_path or -config)
  • -target is always optional and may be provided for both project creation or project opening
  • -strict makes sense only for project opening
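
As an illustration of these rules, the following Python sketch validates the project-related arguments. It is an assumption-laden example, not the actual 'ec' option parser: parse_project_arguments is a hypothetical name, other options such as -batch or -freeze are simply passed through, and the case where no project argument is given is left undecided, as it is in this proposal.

```python
import argparse


def parse_project_arguments(argv):
    """Return (mode, args, other) where mode is "create", "open" or None."""
    parser = argparse.ArgumentParser(prog="ec", add_help=False, allow_abbrev=False)
    parser.add_argument("-config")
    parser.add_argument("-project_path")
    parser.add_argument("-project")
    parser.add_argument("-target")
    parser.add_argument("-strict", action="store_true")
    # Options unrelated to project management end up in `other`.
    args, other = parser.parse_known_args(argv)

    if args.config is not None or args.project_path is not None:
        if args.project is not None:
            raise ValueError("specify either -project or [-project_path + -config]")
        if args.project_path is None:
            raise ValueError("missing -project_path argument")
        if args.config is None:
            raise ValueError("missing -config argument")
        mode = "create"  # -strict has no effect here: there is nothing to compare
    elif args.project is not None:
        mode = "open"
    else:
        mode = None  # behaviour not specified by the proposal
    return mode, args, other


mode, args, other = parse_project_arguments(
    ["-batch", "-project", "/projects/sample/sample.ecp", "-freeze"]
)
print(mode, args.project)
```

Raising ValueError rather than exiting keeps the rules testable; a command-line front end would turn the message into an error and a non-zero exit status.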

'estudio' command line arguments (relative to project management)

-config <ecf>
Path to the .ecf file to compile. When this argument is passed, -project_path must be specified as well.
-project_path <folder>
Specifies the folder where to create the .ecp file (same name as given .ecf file). If a .ecp file is already there, it is overwritten (recreated)
-project <ecp>
Path to project to open.
-target <target>
Target to compile (optional, if not provided, 'last_target' from .ecp is used)

Similar rules as above apply. The graphical IDE should ask the user questions only if something is left unspecified.

Scenarios

  • Create a new project
ec -batch -project_path /projects/sample -config /path/to/ecf/sample.ecf -freeze -c_compile
estudio -project_path /projects/sample -config /path/to/ecf/sample.ecf -freeze -c_compile
  • Open an existing project
ec -batch -project /projects/sample/sample.ecp -freeze -c_compile
estudio -project /projects/sample/sample.ecp
  • Double-clicking on a .ecp file is equivalent to estudio -project <ecp>
  • Double-clicking on a .ecf file is handled like so:
    • if a .ecp file with same name as the .ecf is present in same folder, open the .ecp file in that same folder
    • otherwise, prompt the user to choose which .ecp file to use
      • list all recently compiled projects that could match
      • allow user to 'browse' to .ecp location
      • allow user to create a new project out of the .ecf file
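
The first step of the .ecf double-click handling can be sketched in a few lines of Python. The function name sibling_ecp is hypothetical, and the sketch covers only the lookup of a .ecp with the same name in the same folder; prompting the user is left to the IDE.

```python
from pathlib import Path


def sibling_ecp(ecf_path):
    """Return the .ecp next to the given .ecf, or None if there is none."""
    candidate = Path(ecf_path).with_suffix(".ecp")
    return candidate if candidate.is_file() else None


ecp = sibling_ecp("/path/to/ecf/sample.ecf")
if ecp is None:
    print("no matching .ecp found, the user must choose one")
else:
    print("opening", ecp)
```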

Pseudo-code algorithm

  • .ecp files are processed once when the project is opened (before first degree 6)
if arguments.project_path or arguments.config then
	if arguments.project then
		abort ("specify either -project or [-project_path + -config]")
	elseif not arguments.project_path then
		abort ("missing -project_path argument, [-project_path + -config] go together")
	elseif not arguments.config then
		abort ("missing -config argument, [-project_path + -config] go together")
	end
	create_new_project (arguments.project_path, arguments.config)
elseif arguments.project then
	if arguments.project.extension = "ecp" then
		open_project (arguments.project)
	elseif arguments.project.extension = "epr" then
		convert_old_epr (arguments.project)
	else
		abort ("Unsupported project file")
	end
else
	?
end
 
create_new_project (a_project_path, a_config: STRING) is
	do
		--create new .ecp file
		--delete EIFGENs if present
		open_project (created_ecp_file.path)
	end
 
open_project (a_ecp_path: STRING) is
	do
		create ecp.make (a_ecp_path)
		--check that associated .ecf file exists
		--go through .ecf file and list all environment variables it uses
		--go through all env vars used and:
			--get their 'old' value from .ecp (if any)
			--get their 'new' current value (given by shell)
			if var.is_mismatch then
				mismatches.extend (var)
			end
		--apply similar comparison for ecp.host and ecp.username
		if not mismatches.is_empty then
			if gui_mode then
				-- Show mismatches to user and ask him to pick the correct values, and allow him to cancel compilation altogether
			elseif arguments.strict then
				abort ("Mismatches: " + mismatches)
			else
				warn ("Mismatches: " + mismatches)
			end
		end
		--create or update .ecp file, store 'new' env vars values
		--check that EIFGENs exists, if not create it
		--create .lock file
		--project is opened, we may start compiling it
	end