Title: Multiple Nonzero-Rank Part References
Submitted By: Aleksandar Donev
Status: For Consideration
References: J3/03-253

Basic Functionality:
I propose to delete the constraint that prohibits multiple nonzero-rank part-refs: "In a data-ref, there shall be no more than one part-ref with nonzero rank." There is no justification for this constraint, and removing it would enable a very useful capability for which Fortran is uniquely suited, given its ability to deal with non-contiguous arrays.

Rationale:
The proposed functionality gives two gains:

1) It allows a kind of separation between the implementation of operations on data and the way the data is actually stored, which is unprecedented in other languages. This kind of separation is much more flexible and easier to use than inheritance-based methods (but is more limited in that only data, not methods, are covered). An example is the ability to code a computational geometry package that operates on a collection of points without specifying how the coordinates of the points are stored: in a simple multidimensional array, or inside some complicated hierarchy of derived types.

2) It allows the use of all the powerful array syntax and intrinsics for data stored inside derived types. Take the simple example:

    TYPE Point3D ! A point in 3D
       REAL :: coordinates(3), data(2)
    END TYPE Point3D
    TYPE(Point3D), DIMENSION(10) :: points ! A collection of points

Finding the centroid of the points would be performed with

    WRITE(*,*) "The centroid is", SUM(points%coordinates, DIM=2) / SIZE(points)

which requires no loops. (Under the shape rule given in the specification, points%coordinates has shape (/3,10/), so DIM=2 sums over the points.) Even more useful would be the ability to pass the coordinates of selected points to a procedure (note that this procedure need not know that the coordinates came from an array of derived type Point3D).

Estimated Impact:
The edits needed to implement this are small and localized to Section 6.1.2 (examples are given under Detailed Specification).
References with multiple nonzero-rank part-refs are treated in all respects like data-refs with just a single nonzero-rank part-ref; namely, they are array sections. Therefore I estimate that no other part of the standard will need to be changed.

The implementation of this feature does require some nontrivial work. However, the steps involved are very similar to the way current data-refs and array pointers/sections are handled. I have implemented extensions for the three compilers I use that make such structure components usable, in only a hundred lines of Fortran + C code. I essentially use low-level C code that manipulates the compiler's array descriptors to create a higher-rank array pointer to the data-refs I need, and then I use the array pointer when I need to access the data as a multi-rank array (see my Fortran Forum article).

Detailed Specification:
The main edits needed are the following:

Delete "In a data-ref, there shall be no more than one part-ref with nonzero rank". Then add the constraint

The rank of a data-ref is the sum of the ranks of the part-refs with nonzero rank, if any; otherwise, the rank is zero.
...
Cxxx: The maximum rank of a data-ref shall be 7.

and change the way the rank of data-refs is determined:

The rank and shape of a nonzero-rank part-ref are determined as follows. If the part-ref has no section-subscript-list, the rank and shape are those of part-name. Otherwise, the rank is the number of subscript triplets and vector subscripts in section-subscript-list, and the shape is the rank-1 array whose i-th element is the number of integer values in the sequence indicated by the i-th subscript triplet or vector subscript. If any of these sequences is empty, the corresponding element in the shape is zero. In an array-section, the rank of the array is the sum of the ranks of the nonzero-rank part-refs.
The shape of the array is the rank-1 array obtained by concatenating the shapes of the nonzero-rank part-refs in backward order, i.e., starting from the last one. If the shape has an element with the value zero, the array section has size zero.

There are some other edits that will be needed, mostly in Section 6.1.2.

The Shape of the data-ref
A problem in the proposal as described above is that the Fortran order of specifying components, structure%component, as opposed to the alternative component%structure, is the opposite of the order of concatenation of the shapes of the nonzero-rank part-refs. For example, the reference

    level1(1:4,1:5,1:6)%level2(1:2,1:3)%level3(1:1)

represents an array section of shape (/1,2,3,4,5,6/), and not (/4,5,6,2,3,1/) as might be thought at first. However, this is the best choice for the compiler, the standard, and the user, despite the extra cost of having to be careful with indices in certain situations. I believe the wrong choice was made when component references were chosen to follow the C-style ordering of object%component instead of component%object. This cannot be changed now without introducing a whole new syntax and the associated cost for users and implementors. Instead, we should choose the shape for the data-ref that I describe here and accept the loss of simplicity in the syntax as unavoidable due to past mistakes.

History: Many debates during the design of F8x...

Comments:

John Reid, JKR Associates, Oxford:
I would like to suggest that we allow arrays of arrays, such as

    a(:,:)%comp(:,:)

They are not allowed because when such an array is passed to a dummy argument dum, a(i,j)%comp(k,l) corresponds to dum(k,l,i,j), and the more array parts there are, the more confusing it is seen to be. However, I think we could get used to the rule, and it is not too hard to state. Personally, I would prefer a shorter and simpler proposal and am prepared to work on it if we decide in favour.
I discussed this with Lawrie Schonfelder some time ago by e-mail, and he wants it.

Malcolm Cohen, Nihon NAG, Tokyo:
This seems semi-reasonable, HOWEVER
(i) we still need to maintain that no pointer or allocatable component can occur after a nonzero-rank part.
(ii) it is seriously limited without expanding our current 7-dim limit. Expanding the 7-dim limit has a cost (it makes runtime library routines that traverse an array bigger; they all need rewriting).
Overall, I'd definitely put this feature as being lower priority than expanding our current dimension limit.

SUMMARY:
(1) More important to have more dimensions; pick a number (15 is the smallest number J3 came up with, and is the largest one I'd want to see).
(2) I don't see this particular proposal as being terribly important.
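
Illustration (not part of the proposal or the comments): the index correspondence discussed above, where a(i,j)%comp(k,l) corresponds to dum(k,l,i,j), can be tried out in standard-conforming Fortran by copying into a temporary. The sketch below reuses the Point3D example from the Rationale. The temporary coords and the fill values are assumptions made purely for illustration; the program only emulates the shape (/3,10/) that points%coordinates would have under the proposed rule and does not use the proposed syntax.

```fortran
program multi_rank_emulation
  implicit none

  ! Type taken from the proposal's Rationale example.
  type :: point3d
    real :: coordinates(3), data(2)
  end type point3d

  type(point3d) :: points(10)
  real :: coords(3, 10)   ! hypothetical temporary with the proposed shape (/3,10/)
  real :: centroid(3)
  integer :: i, k

  ! Illustrative fill values: coordinate k of point i is 10*i + k.
  do i = 1, size(points)
    do k = 1, 3
      points(i)%coordinates(k) = real(10*i + k)
    end do
    points(i)%data = 0.0
  end do

  ! Allowed today: each reference has only one part-ref of nonzero rank.
  ! points(i)%coordinates(k) lands in coords(k,i), i.e. the component
  ! index comes first, as in the backward-order shape rule.
  do i = 1, size(points)
    coords(:, i) = points(i)%coordinates
  end do

  ! Same expression as in the Rationale, applied to the temporary.
  centroid = sum(coords, dim=2) / size(points)

  ! Self-check: the mean of 10*i + k over i = 1..10 is 55 + k.
  if (any(shape(coords) /= [3, 10])) error stop "unexpected shape"
  if (any(centroid /= [56.0, 57.0, 58.0])) error stop "unexpected centroid"

  write(*,*) "shape:", shape(coords)
  write(*,*) "The centroid is", centroid
end program multi_rank_emulation
```

With the proposed feature, the copy loop and the temporary would be unnecessary, and SUM(points%coordinates, DIM=2) could be written directly.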