A computer implemented method for optimizing performance of workflow includes associating each of a plurality of workflow nodes in a workflow with a data cache and managing the data cache on a local storage device on one of one or more compute nodes. A scheduler can request execution of the tasks of a given one of the plurality of workflow nodes on one of the one of more compute nodes that hosts the data cache associated with the given one of the plurality of workflow nodes. Each of the plurality of workflow nodes is permitted to access a distributed filesystem that is visible to each of the plurality of compute nodes. The data cache stores data produced by the tasks of the given one of the plurality of workflow nodes.