Skip to content

Conversation

@Iximiel
Copy link
Member

@Iximiel Iximiel commented Sep 10, 2025

Description

Note that this PR is built on top of #1092

@GiovanniBussi, @gtribello, as we discussed last week I started this new approach to the PTM: the accelerated version is now served separately as a plugin.
This has a few advantages:

  • I can test the compilation on the CI without risking NVHPC blocking itself on some omp pragmas
  • You can use Plumed on the CPU compiled with gnu/clang along with the accelerated one
  • I have somehow implemented a mixed precision version of plumed (we have to discuss a few things about it)
  • starting from the overridden PTM in the plugin we can try different new acceleration strategies

I tried to avoid propagating the template declarations the more I could, but I did not always succeed

Looking at the diff some extra changes have slipped in, like the ones in check and some iostream inclusions.

This is only a WIP, only the standard multicolvars are included in the plugin, but I am working on expanding them

Target release

I would like my code to appear in release v2.11

Type of contribution
  • changes to code or doc authored by PLUMED developers, or additions of code in the core or within the default modules
  • changes to a module not authored by you
  • new module contribution or edit of a module authored by you
Copyright
  • I agree to transfer the copyright of the code I have written to the PLUMED developers or to the author of the code I am modifying.
  • the module I added or modified contains a COPYRIGHT file with the correct license information. Code should be released under an open source license. I also used the command cd src && ./header.sh mymodulename in order to make sure the headers of the module are correct.
Tests
  • I added a new regtest or modified an existing regtest to validate my changes.
  • I verified that all regtests are passed successfully on GitHub Actions.

constexpr TensorTyped& operator +=(const TensorTyped& b);
/// decrement
TensorGeneric& operator -=(const TensorGeneric<n,m>& b);
constexpr TensorTyped& operator -=(const TensorTyped& b);

Check notice

Code scanning / CodeQL

Unused static variable Note

Static variable arg is never read.
Comment on lines 185 to 209
constexpr TensorTyped<T,m,n> transpose()const;
/// << operator.
/// Allows printing tensor `t` with `std::cout<<t;`
template<unsigned n_,unsigned m_>
friend std::ostream & operator<<(std::ostream &os, const TensorGeneric<n_,m_>&);
/// Diagonalize tensor.
/// Syntax is the same as Matrix::diagMat.
/// In addition, it is possible to call if with m_ smaller than n_. In this case,
/// only the first (smaller) m_ eigenvalues and eigenvectors are retrieved.
/// If case lapack fails (info!=0) it throws an exception.
/// Notice that tensor is assumed to be symmetric!!!
template<unsigned n_,unsigned m_>
friend void diagMatSym(const TensorGeneric<n_,n_>&,VectorGeneric<m_>&evals,TensorGeneric<m_,n_>&evec);
/// Compute lowest eigenvalue and eigenvector, using a branchless iterative implementation.
/// This function is amenable for running on openACC.
/// The accuracy could be controlled increasing the number of iterations. The default value (24)
/// seems sufficient for most applications.
/// In principle, it could be extended to compute m eigenvalues by:
/// - computing the lowest
/// - compute the proper projector and reduce the matrix to <n-1,n-1>
/// - proceed iteratively
/// The interface should then be likely the same as diagMatSym
template<unsigned n_>
double lowestEigenpairSym(const TensorGeneric<n_, n_>& K, VectorGeneric<n_>& eigenvector, unsigned niter);
friend std::ostream & operator<< <>(std::ostream &os, const TensorTyped&);
};

template <unsigned n,unsigned m>
void TensorGeneric<n,m>::auxiliaryConstructor()
template<typename T, unsigned n, unsigned m>
constexpr void TensorTyped<T,n,m>::auxiliaryConstructor()
{}

template <unsigned n,unsigned m>
template<typename T, unsigned n, unsigned m>
template<typename... Args>
void TensorGeneric<n,m>::auxiliaryConstructor(double first,Args... arg) {
constexpr void TensorTyped<T,n,m>::auxiliaryConstructor(T first,Args... arg) {
d[n*m-(sizeof...(Args))-1]=first;
auxiliaryConstructor(arg...);
}

template <unsigned n,unsigned m>
template<typename T, unsigned n, unsigned m>
template<typename... Args>
TensorGeneric<n,m>::TensorGeneric(double first,Args... arg) {
constexpr TensorTyped<T,n,m>::TensorTyped(T first,Args... arg) {

Check notice

Code scanning / CodeQL

Commented-out code Note

This comment appears to contain commented-out code.
@Iximiel Iximiel changed the title Parallel tasck manager accelerated as a plugin Parallel task manager accelerated as a plugin Sep 10, 2025
@Iximiel Iximiel force-pushed the ptmAccPlug-template branch from 3ab2517 to 5e86861 Compare September 11, 2025 13:05
keys.addFlag("VERBOSE",false,"write a more detailed output");
keys.reset_style("VERBOSE","hidden");
if( keys.getDisplayName()!="SECONDARY_STRUCTURE_DRMSD" && keys.getDisplayName()!="SECONDARY_STRUCTURE_RMSD" ) {
//if( keys.getDisplayName()!="SECONDARY_STRUCTURE_DRMSD" && keys.getDisplayName()!="SECONDARY_STRUCTURE_RMSD" ) {

Check notice

Code scanning / CodeQL

Commented-out code Note

This comment appears to contain commented-out code.
@Iximiel Iximiel force-pushed the ptmAccPlug-template branch from 5e86861 to d9c3b6a Compare September 11, 2025 15:18
const auto targetSize=target.size();

for (unsigned i=0; i<targetSize; ++i) {
const auto& [reference,k,j] = target[i];

Check notice

Code scanning / CodeQL

Unused local variable Note

Variable (unnamed local variable) is not used.
@Iximiel Iximiel force-pushed the ptmAccPlug-template branch from 3450686 to ab62229 Compare September 15, 2025 13:34
Comment on lines +352 to +359
//Vector origin_old, origin_new;
//origin_old=pos[21];
//origin_new=pos[6]+distance;

Check notice

Code scanning / CodeQL

Commented-out code Note

This comment appears to contain commented-out code.
//origin_new=pos[6]+distance;
for(unsigned i=15; i<30; ++i) {
pos[i]+=( origin_new - origin_old );
//pos[i]+=( origin_new - origin_old );

Check notice

Code scanning / CodeQL

Commented-out code Note

This comment appears to contain commented-out code.
@Iximiel Iximiel force-pushed the ptmAccPlug-template branch 3 times, most recently from 621c9a5 to 56fb8fe Compare September 17, 2025 15:22
keys.addActionNameSuffix("_GRID");
T tfunc;
tfunc.registerKeywords( keys );
//keys.addActionNameSuffix("_GRIDACC");

Check notice

Code scanning / CodeQL

Commented-out code Note

This comment appears to contain commented-out code.
Comment on lines +100 to +103
//if( keys.getDisplayName()=="MATRIX_PRODUCT" ) {
// keys.addFlag("ELEMENTS_ON_DIAGONAL_ARE_ZERO",false,"set all diagonal elements to zero");
//}

Check notice

Code scanning / CodeQL

Commented-out code Note

This comment appears to contain commented-out code.
nvectors = getNumberOfArguments()-1;
}
if( getName()=="MATRIX_VECTOR_PRODUCT_ROWSUMS" ) {
if constexpr ( helpers::isRowSum<CV>) { //getName()=="MATRIX_VECTOR_PRODUCT_ROWSUMS" ) {

Check notice

Code scanning / CodeQL

Commented-out code Note

This comment appears to contain commented-out code.
Comment on lines +169 to +174
//_REGISTER_ACTION(colv,"SECONDARY_STRUCTURE_RMSD_CPU");
//typedef AccelerableShortcut<colv> shortcut;
//_REGISTER_ACTION(shortcut,"SECONDARY_STRUCTURE_RMSD");

Check notice

Code scanning / CodeQL

Commented-out code Note

This comment appears to contain commented-out code.
@Iximiel Iximiel force-pushed the ptmAccPlug-template branch from 56fb8fe to ef358f5 Compare September 18, 2025 08:22
Comment on lines 538 to 539
/// Returns the images from the dlloader
const std::vector <void*>& getDLHandles() const noexcept;
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@GiovanniBussi I did this to use the action register's check with the extra handles within FunctionShortcut.
Is this a good idea?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In principle I do not see problems however:

  • I do not know how you use the result. This is supposed to be an internal detail, perhaps what you are doing can be directly encapsulated in an additional method of class PlumedMain
  • It is very difficult to see what this PR does because it is massive

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

* I do not know how you use the result. This is supposed to be an internal detail, perhaps what you are doing can be directly encapsulated in an additional method of class PlumedMain

I use it in FunctionShortcut.h . By "encapsulated it as a method of PlumedMain", do you mean to create something like plumed->check instead of calling the action register singleton while exposing the void* vector?

* It is very difficult to see what this PR does because it is massive

I know, I should have moved the tests as the very last thing, because those are eclipsing the other files in the gh UI.
Apart from that, this branch is based on #1092, for the mixed precision things, so part of the changes can be discussed there

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I use it in FunctionShortcut.h . By "encapsulated it as a method of PlumedMain", do you mean to create something like plumed->check instead of calling the action register singleton while exposing the void* vector?

Yes exactly. The implementation based on void* is quite tricky, so I would rather add an ad hoc method to do (what you need to do)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is ok if I am calling this new method checkAction, like the API command that does exactly what I want?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds perfect

@Iximiel
Copy link
Member Author

Iximiel commented Sep 18, 2025

@gtribello @GiovanniBussi this is nearly done
I have to adjust a little the tests (that as now can only be run locally) but we can check the at least the compilation of the accelerated CV on the CI without disturbing the compilation of the main plumed library

@@ -147,9 +147,15 @@
error("shortcut is using suffix but action created is not ActionWithValue");
}
Keywords thiskeys;
//TODO: check if it is possible to do this: plumed.getKeywordsForAction(av->getName(),thiskeys);

Check notice

Code scanning / CodeQL

Commented-out code Note

This comment appears to contain commented-out code.
@Iximiel Iximiel force-pushed the ptmAccPlug-template branch from 0bbeb75 to f921207 Compare December 4, 2025 15:52
@Iximiel Iximiel marked this pull request as ready for review December 4, 2025 15:53
@Iximiel
Copy link
Member Author

Iximiel commented Dec 4, 2025

Now the CI can test the compilation of the PTM with openacc.
It might need some extra massages because setting up the compiler in the plugin folder is everything but straighforward

error("cannot calculate dissimilarities for sparse matrices");
}
}
}
updateBookeepingArrays( taskmanager.getActionInput().outmat );
taskmanager.setupParallelTaskManager( 2*getPntrToArgument(0)->getNumberOfColumns(), getPntrToArgument(1)->getNumberOfStoredValues() );
//unsigned nvals = getPntrToComponent(0)->getNumberOfColumns();

Check notice

Code scanning / CodeQL

Commented-out code Note

This comment appears to contain commented-out code.
@Iximiel
Copy link
Member Author

Iximiel commented Dec 5, 2025

@gtribello can you help me in getting right the documentation for the new shorcuts?

The Intel error is already corrected, I will push it with the documentation fix

@gtribello
Copy link
Member

Hi @Iximiel

I had a look at this. Can't get the code to run on my laptop any more because of an update, but I have a suggestion on how to solve the issue with the manual. To be clear, the idea is that the DISSIMILARITY_CPU command would never be used in an input file. The input would always be:

DISSIMILARITY ...

If that is the case, then you just need to add the command:

keys.setDisplayName("DISSIMILARITY");

in the registerKeywords method of the actions that have a _CPU version. Adding this command hides the shortcut version from the manual builder. I hope this works.

@Iximiel
Copy link
Member Author

Iximiel commented Dec 15, 2025

Hi @Iximiel

I had a look at this. Can't get the code to run on my laptop any more because of an update, but I have a suggestion on how to solve the issue with the manual. To be clear, the idea is that the DISSIMILARITY_CPU command would never be used in an input file. The input would always be:

DISSIMILARITY ...

If that is the case, then you just need to add the command:

keys.setDisplayName("DISSIMILARITY");

in the registerKeywords method of the actions that have a _CPU version. Adding this command hides the shortcut version from the manual builder. I hope this works.

Thanks! I'll try that with DISSIMILARITY and MATRIX_PRODUCT
I am testing with curl disabled in the manual to speed up things

On my PC it DISSILMILARITY works, but MATRIX_PRODUCT(_CPU) not, it may be a question of getting the correct suffix?

@Iximiel
Copy link
Member Author

Iximiel commented Dec 16, 2025

@gtribello
I think it is something with the suffix and CV with underscores in the non suffixed names.
I'll try to do some massages to the code to make it work

edit: I may be wrong

@gtribello
Copy link
Member

gtribello commented Dec 16, 2025

Hi @Iximiel

I was looking at this again. I think the problem might be that DISSIMILARITIES and MATRIX_PRODUCT use the same template Action class. You can see this from how they are registered:

typedef MatrixTimesMatrix<MatrixProduct> mtimes;
PLUMED_REGISTER_ACTION(mtimes,"MATRIX_PRODUCT")
typedef MatrixTimesMatrix<Dissimilarities> dissims;
PLUMED_REGISTER_ACTION(dissims,"DISSIMILARITIES")

The registerKeywords method for these two actions is the same. Perhaps a solution is to include the following code in MatrixTimesMatrix<T>::registerKeywords.

if( keys.getDisplayName().find("MATRIX_PRODUCT")!=std::string::npos ) {
     keys.setDisplayName("MATRIX_PRODUCT");
     keys.addFlag("ELEMENTS_ON_DIAGONAL_ARE_ZERO",false,"set all diagonal elements to zero");
} else {
    keys.setDisplayName("DISSIMILARITIES");
}

I hope this helps

@Iximiel
Copy link
Member Author

Iximiel commented Dec 16, 2025

Hi @gtribello

On my pc the manual problem looks solved using your first suggestion.

@gtribello
Copy link
Member

Oh jolly good @Iximiel. It looks like it is solved here, too, as the doc-mpi is now building.

@Iximiel
Copy link
Member Author

Iximiel commented Dec 16, 2025

mac 13 is clinically dead as a runner, and the codeQL is not linked to this PR (but may be addressed)
I think it is done

preparing distance for the new plugin

Setting up the new ptm in the new plugin

deactivating a check in ActionShortcut::readInputLine

setting up USEGPU for distance_vector

Templatization of Angle and Torsion

Mixed precision in Pbcs

Mixed precision for GetInputData

moved the multicolvar with openacc in a plugin

Starting the migration of multicolvars to a template

adding some templatization to the PTM

updating the PTM accelerated

now multicolvar is ready to be compiled in the openacc plugin

adding  the plugins to the modulemap

setting up secondaryStructure for the plugin

setting up secondary structure to target the plugin

addressing some codecheck problems

moving some of the volume kw in the headers

moved Volumes to the plugin

added the spherical harmonic to the plugin

adding the matrixtimes family

fixing a warning

added Combine

Between and <= and >=

setting up contact matrix

separating the GPU from the PTM

moved the openACC tests

updating some tests

adding a way to get the dl handles in the actions

addressed some codechek problems

now checkin if an action is registered do note expose the vector of
LOADed handles

now the tests for existing GPU implementation pass, added fccubic, and
better test

compler guard to fccubic

removing an unecessary shadowed variable

now also the gpu-torsion test is checked against the CPU

deactivated the quaternion tests

small changes after the rebase

deleted a support file
@Iximiel Iximiel force-pushed the ptmAccPlug-template branch from 8338b3d to 7d6e556 Compare December 16, 2025 12:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants